11-11-2011, 03:07 PM   #18
DiapDealer
Grand Sorcerer
Posts: 28,672
Karma: 205039118
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Blossom
After reading up, I finally found a better solution. I extracted the SVG images, then used a program called Prince to convert them to PDF files. I then merged them in Acrobat Pro, cropped the result to just the text, ran it through my OCR software, and imported it into Word. It worked wonderfully! It kept the italics, bold, and formatting. The only thing you lose is the pictures, but you can add those in manually.
Out of curiosity... what's your process after OCR'ing (I'm assuming with ABBYY) to Word? I struggle with that step. Not that I can't get a working ebook from it, but I'm usually quite disgusted with the HTML produced by ABBYY and/or the HTML produced by saving a Word doc as Unfiltered HTML. I spend ridiculous amounts of time trying to clean either one up.

I am stuck with Word 2007 and ABBYY FineReader 9.0. Are the newer versions of each miraculously better at producing HTML that doesn't make me want to yak?