Old 05-25-2016, 10:21 PM   #3
dwig
As theducks said, calibre does not do OCR, so the PDF must contain an OCR text layer in addition to the bitmaps of the scanned pages, and calibre is using that text in its conversion.
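
One way to confirm that the PDF really carries a text layer is to ask a text extractor what it finds. This is only a rough sketch, and it assumes poppler's pdftotext command-line tool is installed and on the PATH; the function names are made up for illustration and are not part of calibre.

```python
import subprocess


def pdftotext_cmd(pdf_path):
    # "-" as the output name sends the extracted text to stdout.
    return ["pdftotext", pdf_path, "-"]


def has_text_layer(extracted):
    # pdftotext separates pages with form feeds; strip() treats
    # form feeds as whitespace, so an image-only PDF comes back empty.
    return bool(extracted.strip())


def pdf_has_text(pdf_path):
    # Assumption: pdftotext (poppler-utils) is installed.
    result = subprocess.run(
        pdftotext_cmd(pdf_path), capture_output=True, text=True, check=True
    )
    return has_text_layer(result.stdout)
```

If pdf_has_text() returns True for the scanned PDF, there is hidden OCR text sitting behind the page images, which would explain what calibre is picking up.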

I don't know if there is a way in calibre to force it to ignore the text and use only the bitmaps, but there is a workaround. A "virtual printer" utility installs a printer driver that streams its data to a file in PDF format instead of to a hardware device. If you print the PDF through one, the resulting PDF will be "flattened", with only a bitmap for each page. You would then import this "flattened" PDF into calibre to perform the PDF>AZW3 conversion.
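
For anyone who would rather script this than go through a print dialog, the same "flattening" idea can be approximated by rendering each page to an image and stitching the images back into a PDF. To be clear, this swaps the virtual printer for a different technique (rasterizing), and it is only a sketch under assumptions: poppler's pdftoppm and calibre's ebook-convert are on the PATH, the third-party img2pdf Python package is installed, and all function names and the 150 dpi default are mine, not anything from calibre.

```python
import re
import subprocess
from pathlib import Path


def rasterize_cmd(pdf_path, prefix, dpi=150):
    # pdftoppm writes one PNG per page, named <prefix>-<page number>.png
    return ["pdftoppm", "-r", str(dpi), "-png", pdf_path, prefix]


def convert_cmd(flat_pdf, out_path):
    # calibre's command-line converter: ebook-convert INPUT OUTPUT
    return ["ebook-convert", flat_pdf, out_path]


def sort_pages(names):
    # Sort by the numeric page suffix so page-10 comes after page-2.
    def page_no(name):
        return int(re.search(r"-(\d+)\.png$", name).group(1))

    return sorted(names, key=page_no)


def flatten(pdf_path, flat_pdf, workdir):
    # Assumption: img2pdf (third-party) is installed.
    import img2pdf

    prefix = str(Path(workdir) / "page")
    subprocess.run(rasterize_cmd(pdf_path, prefix), check=True)
    pages = sort_pages([str(p) for p in Path(workdir).glob("page-*.png")])
    with open(flat_pdf, "wb") as f:
        # The rebuilt PDF holds only page images, no text layer.
        f.write(img2pdf.convert(pages))
```

After flatten("scan.pdf", "flat.pdf", "tmp_pages") with an existing tmp_pages folder, running the command from convert_cmd("flat.pdf", "book.azw3") through subprocess.run would hand the image-only PDF to calibre. A higher dpi gives sharper pages at the cost of a larger file.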

I've used both Bullzip PDF Printer and PrimoPDF on Windows for similar tasks in the past. My issues were with PDFs in which partially transparent objects became opaque and obscured the text below them, but the technique should work just as well for the OP's problem. I would think that Mac OS X's built-in service for printing to PDF (found in its normal print dialog) would work as well.