#16 |
Blue. Not sad...just blue
Posts: 218
Karma: 1267018
Join Date: Oct 2009
Location: Japan
Device: Ridibooks Paper Pro
|
Same as Charlie: save to calibre, but leave as Topaz on the Kindle.
Except for books that I really like; for those I'll go to the trouble of converting them to Markdown text and cleaning them up from there. But for something that I just want to read through once, it's not worth the trouble. |
#17 |
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
|
I accidentally bought a fiction Topaz book the other day. It looked horrible on my Kindle. I thought it was a mobi file because it has a file size on the buy page. I guess not. I tried importing into Calibre and the results were just as bad. I try not to buy Topaz but afterward I found the book was not available in any other digital format on the net.
After reading up I finally found a better solution. I extracted the SVG images, then used a program called Prince to convert them to PDF files. I then merged them in Acrobat Pro, cropped the result to just the text, ran it through my OCR software, and imported it into Word. It worked wonderfully! It kept the italics, bold, and formatting. |
#18 | |
Grand Sorcerer
Posts: 28,375
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
I am stuck with Word 2007 and ABBYY FineReader 9.0. Are the newer versions of each miraculously better at producing HTML that doesn't make me want to yak? |
|
#19 |
Addict
Posts: 208
Karma: 757546
Join Date: Sep 2010
Device: Kindle 3 Wifi and Kindle DX Graphite
|
I have two technical books that I really like and they are in Topaz. They look really great. By the way, I don't know how you guys buy a book in Topaz and only find that out after the purchase. It's just a matter of downloading the sample: if, when you press the Aa text key, the option for changing the font face appears greyed out, the book is a Topaz.
Last edited by thebestjeter; 11-11-2011 at 06:21 PM. |
#20 | |
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
|
Quote:
I OCR the PDF in ABBYY and have it save a Word doc with editable content, then I open it in Word and clean it up a bit. This one took little to no work: mostly checking for OCR spelling errors (only a few words), searching and replacing formatting such as words that shouldn't be bold, then applying my macros, saving as filtered HTML, and done!
Some tips I have found:
Make sure ABBYY isn't set to save images of the PDF to Word; Word will make a mess of it. You can manually add them later if you want.
Do not use Calibre to make your PDF. It will choke ABBYY, not to mention the file will be 3 times the size it should be.
Edit the Word doc in Normal view to see how it will look on your eReader.
Use the Paragraph button and learn what each character means so you can catch things out of place by eye.
The one I just did didn't need this extra step, but if the formatting is too messed up, Book Designer 5 will fix it by converting the styles to HTML tags. This works great on fiction books, because plain text is just that and it uses basic B and I tags, etc., so it's easier to edit.
There is a trick in BD5 that will fix most broken sentences too. Import into BD5 with "Keep Original Format" checked, then save as HTML. Change the options to "Reformat completely" with "Keep styles" checked, then import the HTML file you just saved. You'll find almost all broken sentences are fixed, except the ones that have a capitalized word or ' after the break. You can then save as HTML and open that in Word to edit.
I like to open it in Notepad2 before Word and do some quick search-and-replaces to change the DIV tags to P tags and get rid of the " " it adds to each paragraph as an indent. I do my final editing in Word, then import into Calibre for a good, readable copy that works well on my Kindle.
Last edited by Blossom; 11-11-2011 at 05:09 PM. |
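For anyone who would rather script the DIV-to-P cleanup than do it by hand in a text editor, here is a minimal Python sketch of that search-and-replace step. It is not Blossom's method, just an illustration. The function name `divs_to_paragraphs` is made up, and it assumes the indent added to each paragraph is a run of spaces, non-breaking spaces, or `&nbsp;` entities right after the opening tag; the post only shows it as " ", so that reading is a guess.

```python
import re


def divs_to_paragraphs(html: str) -> str:
    """Rename DIV tags to P tags and drop leading indent padding (hypothetical helper)."""
    # Swap opening and closing DIV tags for P tags, keeping any attributes.
    html = re.sub(r"<(/?)div\b", r"<\1p", html, flags=re.IGNORECASE)
    # Assumption: the indent is spaces, tabs, U+00A0, or &nbsp; directly after <p ...>.
    html = re.sub(
        r"(<p\b[^>]*>)(?:&nbsp;|\u00a0|[ \t])+",
        r"\1",
        html,
        flags=re.IGNORECASE,
    )
    return html


if __name__ == "__main__":
    sample = '<DIV class="x">&nbsp;&nbsp; Hello</DIV>'
    print(divs_to_paragraphs(sample))
```

Run it on a copy of the saved HTML first; regular expressions are fine for this kind of flat, machine-generated markup but will not cope with nested or malformed tags.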
|
#21 |
Séduisant
Posts: 4,706
Karma: 2107018
Join Date: Jan 2010
Location: Texas, USA
Device: Boox Note Air2+; Kobo Libra2; Kindle Scribe, Oasis3; iPad Mini6
|
I wasn't aware of this. Thanks. I was always looking for the number of pages. But the font thing is a sure bet.
|
#22 | |
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
|
Quote:
![]() |
|
#23 |
Grand Sorcerer
Posts: 28,375
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Thanks for the tips, Blossom! I don't do very many conversions of that nature, so I'll have to see if the FineReader upgrade makes financial sense for me. But the few PDF->FineReader conversions I've attempted have driven me crazy. I could probably have saved time by retyping the whole thing... after all the manual tweaking involved.
#24 | |
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
|
Quote:
![]() |
|
#25 | |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
|
|
#26 | |
Treasure Seeker
Posts: 18,708
Karma: 26026435
Join Date: Mar 2010
Device: Kobo HD Glo, Kindles, Kindle Fires, Android Devices
|
Quote:
I then use Acrobat Pro to merge the PDF files into one single PDF made up of the SVG images. I then crop the PDF to remove the arrows and gray border, leaving only the text area. The final result makes a good PDF to read on your PC or to run through an OCR program to convert to an editable text format. The image PDF is not recommended for an eReader: for the book I did, the PDF was 100MB when done. |
|
#27 |
Blue. Not sad...just blue
Posts: 218
Karma: 1267018
Join Date: Oct 2009
Location: Japan
Device: Ridibooks Paper Pro
|
Thanks for the tip about Prince; I'll have to look into that! Anything to smack down those Topaz files!
My ebook workflow from PDF is to convert the OCRed file to Markdown-formatted text for editing, before converting it to mobi for the Kindle. I tried using Word for a while but the Styles palette drove me nuts. Markdown is like an easy-to-edit, stripped-down version of HTML that calibre understands. It's very human-readable, so I've found it comfortable. In more detail:
FineReader 11 seems to be quite a bit better than v9 that I was using before. Very happy with it. Since I upgraded it's finally really worthwhile to get my scanned-but-not-OCRed-yet library out of limbo. |
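To give a feel for what "stripped-down HTML" means here, below is a rough Python sketch of turning simple OCR output into Markdown. The post does not say which tool does the conversion, so this is purely an illustration. The function name `simple_html_to_markdown` is made up, and it assumes the input is flat HTML using only paragraph, bold, and italic tags.

```python
import re
from html import unescape


def simple_html_to_markdown(html: str) -> str:
    """Convert flat HTML (p, b/strong, i/em only) to Markdown text (hypothetical helper)."""
    # Bold and italic tags become Markdown emphasis markers.
    text = re.sub(r"</?(?:b|strong)\s*>", "**", html, flags=re.IGNORECASE)
    text = re.sub(r"</?(?:i|em)\s*>", "*", text, flags=re.IGNORECASE)
    # Paragraphs become blocks separated by a blank line.
    text = re.sub(r"<p\b[^>]*>", "", text, flags=re.IGNORECASE)
    text = re.sub(r"</p\s*>", "\n\n", text, flags=re.IGNORECASE)
    # Drop any tags this sketch does not handle, then decode entities such as &amp;.
    text = re.sub(r"<[^>]+>", "", text)
    text = unescape(text)
    # Collapse extra blank lines and end the text with a single newline.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"


if __name__ == "__main__":
    print(simple_html_to_markdown("<p>It was <i>very</i> <b>dark</b>.</p><p>Tom &amp; Jerry</p>"))
```

The result is plain text you can fix up in any editor, with italics and bold still visible as `*...*` and `**...**`.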
Thread | Thread Starter | Forum | Replies | Last Post |
My Run-In With Topaz | SpiderMatt | Amazon Kindle | 50 | 03-13-2011 06:48 PM |
Can you tell if it's topaz before you buy it? | GA Russell | General Discussions | 12 | 01-17-2011 11:13 AM |
Just bought a Topaz | mr ploppy | Amazon Kindle | 24 | 12-25-2010 11:49 AM |
A Decent Topaz | Gideon | Amazon Kindle | 4 | 04-21-2009 09:21 PM |
Topaz looks horrible... | AnemicOak | Amazon Kindle | 17 | 03-03-2009 10:18 PM |