Old 11-04-2011, 05:57 AM   #16
Snorkledorf
Blue. Not sad...just blue
Posts: 218
Karma: 1267018
Join Date: Oct 2009
Location: Japan
Device: Ridibooks Paper Pro
Same as Charlie: save to calibre, but leave as Topaz on the Kindle.

Except for books that I really like, then I'll go to the trouble of converting them to Markdown text and cleaning them up from there. But for something that I just want to read through once it's not worth the trouble.
Old 11-11-2011, 02:48 PM   #17
Blossom
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
I accidentally bought a fiction Topaz book the other day. It looked horrible on my Kindle. I thought it was a mobi file because it had a file size on the buy page, but I guess not. I tried importing it into Calibre and the results were just as bad. I try not to buy Topaz, but afterward I found the book was not available in any other digital format on the net.

After reading up I finally found a better solution. I extracted the SVG images, then used a program called Prince to convert them to PDF files. I then merged them in Acrobat Pro, cropped the result to just the text, ran it through my OCR software and imported it into Word. It worked wonderfully! It kept the italics, bold and formatting. The only thing you lose is the pictures, but you can add those back manually.
Old 11-11-2011, 03:07 PM   #18
DiapDealer
Grand Sorcerer
Posts: 28,375
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Blossom View Post
After reading up I finally found a better solution. I extracted the SVG images then used this program called Prince to convert them to pdf files. I then merge them in Acrobat Pro and cropped it to just the text then used my OCR software and imported into Word. It worked wonderfully! It kept the italics, bold and format. The only thing you lose is the pictures but you can add those in manually.
Out of curiosity... what's your process after OCR'ing (I'm assuming ABBYY) to Word? I struggle with that step. Not that I can't get a working ebook from it, but I'm usually quite disgusted with the HTML produced by ABBYY and/or the HTML produced by saving a Word doc as unfiltered HTML. I spend ridiculous amounts of time trying to clean either up.

I am stuck with Word 2007 and ABBYY FineReader 9.0. Are the newer versions of each miraculously better at producing HTML that doesn't make me want to yak?
Old 11-11-2011, 03:09 PM   #19
thebestjeter
Addict
Posts: 208
Karma: 757546
Join Date: Sep 2010
Device: Kindle 3 Wifi and Kindle DX Graphite
I have two technical books that I really like and they are in Topaz. They look really great. By the way, I don't know how you guys buy a book in Topaz and only find that out after the purchase. It's just a matter of downloading the sample: if the option for changing the font face is greyed out when you press the Aa text key, the book is a Topaz.

Last edited by thebestjeter; 11-11-2011 at 06:21 PM.
Old 11-11-2011, 04:58 PM   #20
Blossom
Treasure Seeker
Quote:
Originally Posted by DiapDealer View Post
Out of curiosity... what's your process after OCR'ing (I'm assuming ABBYY) to Word? I struggle with that step. Not that I can't get a working ebook from it, but I'm usually quite disgusted with the HTML produced by ABBYY And/Or the HTML produced by saving a Word doc as Unfiltered HTML. I spend ridiculous amounts of time trying to clean either up.

I am stuck with Word 2007 and ABBYY FineReader 9.0. Are the newer versions of each miraculously better at producing HTML that doesn't make me want to yak?
I use Word 2003 and ABBYY FineReader Pro 11. I find v.11 does a fantastic job compared to v.9, which I also had, and it is well worth the upgrade!

I OCR the PDF in ABBYY, have it save as a Word doc with editable content, and then open it in Word and clean it up a bit. This one took little to no work: mostly checking for OCR spelling errors (only a few words), using search and replace to fix formatting such as words that were bold but shouldn't be, then applying my macros and saving as filtered HTML. Done! It took about 45 minutes, but that's only because I scrolled through it twice to make sure I didn't miss anything. This was my first Topaz to PDF to HTML conversion, so I wanted to make sure it was well done.

Some tips I have found: make sure ABBYY isn't set to save images from the PDF into the Word doc. Word will make a mess of it. You can manually add them later if you want.

Do not use Calibre to make your PDF. It will choke ABBYY, not to mention the file will be 3 times the size it should be.

Edit the Word doc in normal view to get an idea of how it will look on your eReader. Turn on the Paragraph button and learn what each formatting character means so you can use your eye to catch things out of place.

The one I just did didn't need this extra step, but if the formatting is too messed up, Book Designer 5 will fix it by converting the styles to HTML tags. This works great on fiction books because the text is plain and only uses the basic tags (B, I, etc.), so it's easier to edit.

There is a trick in BD5 that will fix most broken sentences too. Import into BD5 with "Keep Original Format" checked, then save as HTML. Change the import options to "Reformat completely" with "Keep styles" checked, then import the HTML file you just saved. You'll find almost all broken sentences are fixed, except the ones where a capitalized word or an apostrophe (') follows the break.
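
For anyone who wants to see the idea behind that kind of repair, here is a rough Python sketch of a line-rejoining heuristic. It is not how BD5 works internally (that isn't documented in this thread); it is only an illustration that assumes a break is accidental when the previous line has no closing punctuation and the next line starts with a lowercase letter. The function name and the punctuation set are made up for the example.

```python
import re

# Characters that plausibly end a sentence or a quoted line (an assumption).
SENTENCE_END = re.compile(r"[.!?:;\"'\u201d\u2019)]$")

def rejoin_broken_lines(lines):
    """Merge a line into the previous one when the break looks accidental."""
    merged = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip empty lines
        previous_open = bool(merged) and not SENTENCE_END.search(merged[-1])
        # Only a lowercase start counts as a continuation, so a break followed
        # by a capital letter or an apostrophe is left alone.
        if previous_open and text[0].islower():
            merged[-1] = merged[-1] + " " + text
        else:
            merged.append(text)
    return merged

for paragraph in rejoin_broken_lines([
    "He walked down the",
    "road to the station.",
    "Then he stopped near",
    "London for a while.",
]):
    print(paragraph)
```

Note that the last two lines stay split because "London" is capitalized, which is the same kind of leftover described above and still needs fixing by hand.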

You can then save as HTML and open that in Word to edit. I like to open it in Notepad2 before Word and do some quick search-and-replaces to change the DIV tags to P tags and get rid of the "    " it adds to each paragraph as an indent.
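
The same search-and-replace pass could be scripted. Below is a minimal Python sketch, assuming the indent shows up as plain spaces, tabs or non-breaking spaces right after the opening tag (the exact padding the tool writes is an assumption), and that the DIVs are not nested. The function name is hypothetical.

```python
import re

def divs_to_paragraphs(html: str) -> str:
    """Turn <div> wrappers into <p> and drop the leading indent padding."""
    # Rename opening and closing DIV tags, keeping any attributes.
    html = re.sub(r"<div\b([^>]*)>", r"<p\1>", html, flags=re.IGNORECASE)
    html = re.sub(r"</div\s*>", "</p>", html, flags=re.IGNORECASE)
    # Remove spaces, tabs or non-breaking spaces right after an opening <p>.
    html = re.sub(
        r"(<p\b[^>]*>)(?:&nbsp;|\u00a0|[ \t])+",
        r"\1",
        html,
        flags=re.IGNORECASE,
    )
    return html

sample = '<DIV class="body">&nbsp;&nbsp;&nbsp;&nbsp;It was a dark night.</DIV>'
print(divs_to_paragraphs(sample))
```

Run it on a copy of the file first; a regex pass like this knows nothing about the document structure, so it is worth skimming the result before moving on to Word.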

I do my final editing in Word then import into Calibre for a good readable copy that works well on my Kindle.

Last edited by Blossom; 11-11-2011 at 05:09 PM.
Old 11-11-2011, 05:03 PM   #21
SCION
Séduisant
Posts: 4,706
Karma: 2107018
Join Date: Jan 2010
Location: Texas, USA
Device: Boox Note Air2+; Kobo Libra2; Kindle Scribe, Oasis3; iPad Mini6
Quote:
Originally Posted by thebestjeter View Post
... It's matter of just download the sample, and if when you press the key text Aa, the option for change the Font face appears grey out, the books it's a Topaz.
I wasn't aware of this. Thanks. I was always looking for number of pages. But the font thing is a sure bet.
Old 11-11-2011, 05:08 PM   #22
Blossom
Treasure Seeker
Quote:
Originally Posted by thebestjeter View Post
I have two technical books that I really like and they are in Topaz. They look really great. By the way, I don't know how you guys buy a book in Topaz and only find out that after doing the purchase. It's matter of just download the sample, and if when you press the key text Aa, the option for change the Font face appears grey out, the books it's a Topaz.
I don't use my Kindle for samples. I use K4PC, and it uses a similar font, so I had no idea it was Topaz till I bought it. It had a file size so I thought I was safe, but it's okay; it turned out to be a nice learning experience. Now I know what to do with all the free Topaz books I've gotten over the months.
Old 11-11-2011, 05:10 PM   #23
DiapDealer
Grand Sorcerer
Thanks for the tips, Blossom! I don't do very many conversions of that nature, so I'll have to see if the FineReader upgrade makes financial sense for me. But the few PDF->FineReader conversions I've attempted have driven me crazy. I could probably have saved time by retyping the whole thing... after all the manual tweaking involved.
Old 11-11-2011, 05:16 PM   #24
Blossom
Treasure Seeker
Quote:
Originally Posted by DiapDealer View Post
Thanks for the tips, Blossom! I don't do very many conversions of that nature, so I'll have to see if the Finereader upgrade makes financial sense for me. But, the few PDF->Finereader conversions I've attempted have driven me crazy. I could have probably saved time by retyping the whole thing... after all the manual tweaking involved.
Give the trial a try before upgrading. I had no problems with the 3 PDFs I've done with it. It doesn't have much trouble at all recognizing the words. My only issue is that this program takes power! It uses nearly 4GB of RAM at times, but I have a quad core and 6GB of RAM.
Old 11-12-2011, 03:37 AM   #25
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by Blossom View Post
Give the Trial a try before upgrading. I had no problems with the 3 pdfs I've done with it. It doesn't have much problems at all recognizing the words. My only issue has been this program takes power! It uses nearly 4GB of ram at times but I have a quad core and 6GB of ram.
How do you do the initial step of converting the Topaz book into a PDF file?
Old 11-12-2011, 03:54 AM   #26
Blossom
Treasure Seeker
Quote:
Originally Posted by HarryT View Post
How do you do the initial step of converting the Topaz book into a PDF file?
I extract the SVG images (xhtml files) with help from good ole Alf, then use a free program called Prince to convert each image into a PDF file.

I then use Acrobat Pro to merge the PDF files into one single PDF, which is full of the SVG images. I then crop the PDF to remove the arrows and gray border, leaving only the text area.

The final result makes a good PDF that is readable on your PC or can be run through an OCR program to convert it to an editable text format. The image PDF is not recommended for an eReader. For the book I did, the PDF was 100MB when done.
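
If there are a lot of pages, the Prince step could be batched. This is only a sketch under some assumptions: the extracted pages sit in one folder as `.xhtml` files whose names sort in page order, and Prince is installed and callable as `prince` with its usual `prince input -o output.pdf` form. The function names and folder layout are made up for the example. Merging and cropping would still happen in Acrobat Pro as described above.

```python
from pathlib import Path
import subprocess

def prince_command(page: Path, out_dir: Path) -> list[str]:
    """Build the Prince command line that renders one page to a PDF."""
    pdf = out_dir / (page.stem + ".pdf")
    return ["prince", str(page), "-o", str(pdf)]

def convert_pages(svg_dir: Path, out_dir: Path, run: bool = False) -> list[list[str]]:
    """Prepare (and optionally run) one Prince call per page, in name order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = sorted(svg_dir.glob("*.xhtml"))  # assumed extension and ordering
    commands = [prince_command(page, out_dir) for page in pages]
    if run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # needs Prince on the PATH
    return commands
```

Calling `convert_pages(Path("svg"), Path("pdf"))` only returns the commands so they can be checked first; pass `run=True` to actually invoke Prince.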
Old 11-18-2011, 06:41 AM   #27
Snorkledorf
Blue. Not sad...just blue
Thanks for the tip about Prince; I'll have to look into that! Anything to smack down those Topaz files!

My ebook workflow from PDF is to convert the OCRed file to Markdown-formatted text for editing, before converting it to mobi for the Kindle.

I tried using Word for a while but the Styles palette drove me nuts. Markdown is like an easy-to-edit stripped-down version of HTML that calibre understands. Very human-readable so I've found it comfortable.

In more detail:
  1. Run the scanned PDF through ABBYY FineReader 11 (running in a virtual Windows 7 machine on my Mac). Spellcheck it here.
  2. Save as HTML.
  3. Import that into calibre where it becomes a ZIP archive.
  4. Convert from ZIP to TXTZ (if it has images) or TXT (if not). Set calibre's conversion output settings to Format: Markdown; Do not remove links: on; Do not remove images: on.
  5. Rename the .txtz file to .zip, and unzip it.
  6. Use BBEdit to clean up the resulting Markdown-formatted text file using Regular Expressions (BBEdit also understands Markdown and will preview it for me).
  7. Import it back into calibre and convert to mobi for the Kindle.
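
Step 5 works because a TXTZ file is an ordinary ZIP archive, so the rename-and-unzip could also be done in one go from a script. A small Python sketch follows; the function name is hypothetical, and it assumes the Markdown text is stored as one or more `.txt` files inside the archive.

```python
import zipfile
from pathlib import Path

def unpack_txtz(txtz_path, dest_dir):
    """Extract a TXTZ archive and return the text files found inside."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # No rename needed: zipfile reads the archive whatever its extension is.
    with zipfile.ZipFile(txtz_path) as archive:
        archive.extractall(dest)
    return sorted(dest.rglob("*.txt"))
```

The returned paths are the files to open in the text editor for step 6; any images in the archive are extracted alongside them.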

FineReader 11 seems to be quite a bit better than v9 that I was using before. Very happy with it. Since I upgraded it's finally really worthwhile to get my scanned-but-not-OCRed-yet library out of limbo.
MobileRead.com is a privately owned, operated and funded community.