Old 06-20-2013, 05:59 AM   #11
DSpider
Quote:
Originally Posted by patrik
I would be interested to hear from all of you using Word/OO between Finereader and Sigil, what do you do that is not easily done in Sigil?
You know how FineReader creates styles for bolds and italics? Yeah, I hate those, so I run a custom Word 2010 macro that will turn the document into plain text (yes, that's right) with formatting intact and then have it come back squeaky clean. Then I go through the whole thing and recreate the layout.
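
To give a rough idea of what "plain text with formatting intact" can mean, here is a minimal Python sketch. It is not the Word macro itself. It assumes the document is available as a list of hypothetical (text, bold, italic) runs, writes the bold and italic spans out as simple markers, and then rebuilds clean runs from that text, merging neighbours that share the same formatting. The function names and the `<b>`/`<i>` markers are made up for illustration.

```python
import re


def to_marked_text(runs):
    """Flatten (text, bold, italic) runs into plain text with <b>/<i> markers."""
    parts = []
    for text, bold, italic in runs:
        if bold:
            text = f"<b>{text}</b>"
        if italic:
            text = f"<i>{text}</i>"
        parts.append(text)
    return "".join(parts)


def from_marked_text(marked):
    """Rebuild runs from marked text, merging neighbours with identical flags."""
    runs = []
    bold = italic = False
    # The capturing group keeps the markers in the split result.
    for token in re.split(r"(</?[bi]>)", marked):
        if token == "<b>":
            bold = True
        elif token == "</b>":
            bold = False
        elif token == "<i>":
            italic = True
        elif token == "</i>":
            italic = False
        elif token:
            if runs and runs[-1][1:] == (bold, italic):
                # Same formatting as the previous run: extend it instead of
                # starting a new one. This is the "squeaky clean" part.
                runs[-1] = (runs[-1][0] + token, bold, italic)
            else:
                runs.append((token, bold, italic))
    return runs


if __name__ == "__main__":
    messy = [
        ("A ", False, False),
        ("bold", True, False),
        (" word", True, False),  # same formatting, needlessly split
        (" and ", False, False),
        ("italic", False, True),
    ]
    for run in from_marked_text(to_marked_text(messy)):
        print(run)
```

The sketch does not escape literal `<b>` or `<i>` in the source text, so it would need more care on real documents.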

I do not recommend converting. PDF isn't the most friendly format out there, and if the file wasn't saved as a tagged PDF (and over 90% of PDFs out there aren't), then it's really not worth trying to convert it using Mobipocket or whatever. A quick test: select some random text. If the selection looks like several separate letters and groups of letters, it's not a tagged PDF. OCR it instead. Conversion software has to approximate the location of paragraphs (since each of those groups has its own coordinates, like on a blank piece of paper), and that may result in paragraphs within paragraphs, or a paragraph placed before the wrong paragraph, and so on. No, thanks!
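
To illustrate why this guesswork is fragile, here is a minimal Python sketch of the kind of heuristic a converter has to fall back on. It is not how Mobipocket or any particular tool works. It assumes hypothetical text fragments given as (x, y, text) with y growing down the page, and the tolerance and gap thresholds are arbitrary values picked for the example.

```python
def group_into_lines(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into lines by similar y, ordered by x."""
    lines = []  # each entry: [y of first fragment, [(x, text), ...]]
    for x, y, text in sorted(fragments, key=lambda f: (f[1], f[0])):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    return [(y, " ".join(t for _, t in sorted(items))) for y, items in lines]


def group_into_paragraphs(lines, max_gap=14.0):
    """Join consecutive lines into one paragraph unless the vertical gap is large."""
    paragraphs = []
    prev_y = None
    for y, text in lines:
        if prev_y is not None and y - prev_y <= max_gap:
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)
        prev_y = y
    return paragraphs


if __name__ == "__main__":
    fragments = [
        (50, 100, "world"),
        (10, 100.5, "Hello"),  # same visual line, slightly different y
        (10, 112, "again."),
        (10, 140, "New"),
        (40, 140, "paragraph."),
    ]
    print(group_into_paragraphs(group_into_lines(fragments)))
```

Everything here depends on thresholds. On a two-column page, fragments from both columns share the same y, so this sketch would merge them into a single line, which is exactly the kind of misplaced text described above.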

The title of this thread is wrong. You do not convert with ABBYY FineReader. You OCR with it, and then manually tweak the stuffing out of it with some other software. Think of FineReader as an extraction tool. You extract text from images, and that's it. There are no layout options in FineReader.