10-10-2014, 09:48 PM   #10
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by ittiandro
I have attempted a new EPUB conversion following your instructions in regard to selecting the proper area types for different parts of the page(s), but these non-text areas are still not rendered properly or not rendered at all in the conversion when I open the EPUB file in my tablet.
.... I can't gather anything from this PDF. Can you maybe take screenshots of what your Finereader page looks like? Do you have red boxes around the Figures?

And you haven't shown what the EPUB output is either.

If you click View - Image and Text Window, you should be able to see what Finereader will be outputting.

[Attached screenshot: FinereaderSideBySide.png]

Look at my screenshots of Finereader above. See how the left half of the screen shows the original PDF with green/red boxes, and the right half shows the OCR text (with blue highlights around unsure characters)? Does it look similar on your end?

The stuff that appears in the "View" window is what will appear when you export the file. Can you see the figures in the View window?
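
As a quick sanity check on the export itself: an EPUB is just a ZIP archive, so you can list the image files inside it to see whether any figures made it through at all. Here is a rough Python sketch of that idea. It is only an illustration, not a Finereader feature; the function name, the list of extensions, and the example filename are all assumptions.

```python
import zipfile

# Assumed set of image extensions; extend it if your export uses other formats.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg")


def list_epub_images(epub_path):
    """Return the names of image files stored inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(epub_path) as epub:
        return [
            name
            for name in epub.namelist()
            if name.lower().endswith(IMAGE_EXTENSIONS)
        ]


# Example call (hypothetical filename):
# print(list_epub_images("physics_book.epub"))
```

If the list comes back empty, the figures probably never made it into the export, which would point at the area types in Finereader rather than at the reading app on the tablet.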

Quote:
Originally Posted by ittiandro
I enclose a few samples pages. If somebody wants to be kind enough to have a look at them and may be tell me what I am not doing or doing wrong, i'd appreciate.
Is this the original source? Perhaps you accidentally sent a few pages out of Finereader?

If that is the case, you might also want to go into Options - Save - PDF, and set "Image Settings" to "Best quality (source image resolution)". This will make sure the PDF output matches the original and doesn't get compressed to death.

You might also want to set Save Mode to "Text Under the Page Image" (this keeps the original scan visible and just hides the OCRed text behind it).

Quote:
Originally Posted by ittiandro
Being a scientific book ( physics) there are quite a few math symbols and special characters which may be beyond proper recognition by the ABBYY software ( and by me, since I am not into maths), but this does not worry me too much. All I am striving for is to get a proper rendition of the basic figures, diagrams and tables, in order to get a basic understanding of some of the issues.
Ouch... I would strongly recommend against trying to make an EPUB of a physics book, ESPECIALLY for your first time. There are WAY too many figures, complex equations, sub/superscripts, inline equations, and Greek/mathematical/weird symbols (that Finereader won't get correct).

It would absolutely take forever, even for someone who knows what they are doing (let me tell you... I wouldn't touch digitizing a physics book with a ten-foot pole). :P

Last edited by Tex2002ans; 10-10-2014 at 09:58 PM.