08-19-2010, 03:49 PM   #4
chaley
Quote:
Originally Posted by Timber
Nope. OCRing the file makes them marginally smaller as PDFs, but they still get huge as LRFs.
I think that Acrobat's OCR leaves the page images in place and associates the recognized text with the characters it came from, as a kind of overlay. This is why you can sometimes search text in PDFs that are obviously images. There was a thread some time back about Greek characters in documents that demonstrated this: when looking at the PDF, one saw Greek, but ebooks made from the OCRed text had garbage in the same spot.

Try saving the OCRed PDF as text. That will get rid of the images. You could also try HTML.
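
If you would rather do the text export outside Acrobat, something like the sketch below might work. It assumes the pdftotext command-line tool (from Poppler or Xpdf) is installed and on your PATH, which is my assumption and not something from this thread. The function names and the file name book.pdf are made up for illustration. It only pulls out the text layer the OCR added, so any OCR mistakes come along with it.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path, keep_layout=False):
    """Return the argument list for a pdftotext call (does not run it)."""
    # -enc UTF-8 asks pdftotext to write the output as UTF-8.
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout tries to preserve the physical layout of the page.
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    return cmd


def pdf_to_text(pdf_path, txt_path=None, keep_layout=False):
    """Extract the text layer of an OCRed PDF, leaving the images behind."""
    pdf_path = Path(pdf_path)
    # Default output name: same as the PDF but with a .txt extension.
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    subprocess.run(
        build_pdftotext_command(pdf_path, txt_path, keep_layout),
        check=True,  # raise if pdftotext reports an error
    )
    return txt_path


if __name__ == "__main__":
    # Hypothetical file name; replace with your own OCRed PDF.
    print(pdf_to_text("book.pdf"))
```

The resulting text file has no images in it, so it should convert to a much smaller LRF than the scanned PDF.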