MobileRead Forums - View Single Post - small PDFs becoming huge LRFs when converted

chaley · 08-19-2010, 04:49 PM

Quote:

Originally Posted by Timber

Nope. OCRing the files makes them marginally smaller as PDFs, but they still get huge as LRFs.

I think that Acrobat's OCR leaves the page images in place and lays the recognized text over them, tying each character to the part of the image it came from. This is why you can sometimes search text in PDFs that are obviously images. There was a thread some time back about Greek characters in documents that demonstrated this. When looking at the PDF, one saw Greek, but ebooks made using the OCRed text had garbage in the same spot.

Try saving the OCRed PDF as text. That will get rid of the images. You could also try saving it as HTML.
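
If you would rather script the text export than use Acrobat's Save As, here is a rough sketch that does it with the pdftotext command-line tool from Poppler. That tool is my substitution, not something the thread used, and the function names (build_command, pdf_to_text) and file names are made up for illustration. It pulls out only the text layer, so the page images are left behind, and the result is only as good as the OCR text.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, txt_path):
    """Return the pdftotext argument list for extracting the text layer."""
    # -enc UTF-8 writes the output as UTF-8. The text is only as good as
    # the OCR layer, so badly recognized characters stay wrong.
    return ["pdftotext", "-enc", "UTF-8", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, txt_path=None):
    """Extract the text of an OCRed PDF into a plain text file.

    Assumes the pdftotext binary (Poppler) is installed and on PATH.
    """
    pdf_path = Path(pdf_path)
    if txt_path is None:
        # Default: same name as the PDF, with a .txt extension.
        txt_path = pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_command(pdf_path, txt_path), check=True)
    return Path(txt_path)
```

Calling pdf_to_text("book.pdf") would write book.txt next to the PDF, and that text file can then be converted to LRF in place of the original PDF.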