10-25-2011, 07:29 AM   #1
BranMakMorn
Enthusiast
Posts: 30
Karma: 10
Join Date: Jan 2010
Device: none
Question about OCRd djvu and pdf with ABBYY

Hello everyone,

I've got a nagging problem that I couldn't solve by browsing this section of the forum. So here it is: I have some books in .djvu format that I want to convert to .pdf PRESERVING THE OCR so that I can read and annotate them on an iPad.

Now, I can of course open the djvu with ABBYY FineReader: it will scan the whole document and read the text, usually doing a very good job.

BUT when I produce the OCRd .pdf, it is a 'copy' of the original text, not the page as it was. In other words: I don't want a 're-typed' copy of the book (not least because ABBYY does an awful job with numbered footnotes); I want to keep the EXACT same look as the printed book (font, spacing... everything).

I can achieve this, of course, if I simply 'print' the djvu file as a .pdf. But if I do this, I lose the searchable text: the result is just an image.

So the question is: Is there any way to convert a djvu file to pdf, preserving BOTH the OCRd text (searchability) AND the original appearance of the pages?

Thank you!