Quote:
Originally Posted by fredthefork
Hello! I'm new to this so please forgive me if this is basic knowledge.
I have a PDF file which is OCRed. I would like to convert it to epub. ...
What am I missing here? This can't be so difficult, - can it? 
Yes.
One, "cropping" tools like Briss don't delete anything. They just set a new page size for viewing. The old data is still there; it's just off the page and out of view.
Two, the PDF was OCR'd before it was cropped. The headers and similar "junk" are still in the text layer from the OCR process and still "visible" to the format converter, so they end up in the ePub.
You might be more successful if you "crop" the PDF first and then do the OCR. This might prevent the OCR process from "seeing" the parts that were trimmed.
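If it helps to picture why the cropped-out text still shows up, here is a rough toy model in Python. It is only a sketch, not how Briss or any converter actually works: the `Word` class, the sample words, the coordinates and the crop box are all made up for illustration. The idea is that the text layer is a list of words with positions, and the crop box is just a rectangle stored alongside them.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge in PDF points (hypothetical values)
    y: float  # bottom edge in PDF points, origin at bottom-left


# Hypothetical text layer for one 612 x 792 pt page after OCR.
text_layer = [
    Word("Chapter", 72, 760),  # running header
    Word("One", 120, 760),     # running header
    Word("It", 72, 700),       # body text
    Word("was", 90, 700),
    Word("late.", 115, 700),
    Word("17", 300, 20),       # page number in the footer
]

# Hypothetical crop box as (left, bottom, right, top).
# "Cropping" only stores this rectangle; text_layer is unchanged.
crop_box = (50, 50, 562, 740)


def all_text(words):
    """What a converter sees if it ignores the crop box."""
    return " ".join(w.text for w in words)


def text_inside(words, box):
    """Keep only words whose anchor point lies inside the box."""
    left, bottom, right, top = box
    return " ".join(
        w.text for w in words
        if left <= w.x <= right and bottom <= w.y <= top
    )


print(all_text(text_layer))
print(text_inside(text_layer, crop_box))
```

The first print still includes the header and page number, which is roughly what happens when a converter reads the whole text layer. The second shows what you would get if the text outside the visible area were actually filtered out, which is the effect you are hoping for by cropping before OCR.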