MobileRead Forums - View Single Post - How can I convert a scanned PDF into a searchable/text-based PDF in Calibre?

feuille · 05-29-2026, 03:00 AM

I have been using a plugin for this purpose for a while now. I haven't released it yet because I still want to add a proofreading step. It is based on OCRmyPDF, which in turn relies on Tesseract for the OCR itself. If anyone wants to try it, I could release it without the proofreading feature (which uses the hOCR files generated by Tesseract). For high-quality scans, the results are already excellent out of the box.
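For anyone who wants to try the underlying tool directly in the meantime, here is a minimal sketch of calling OCRmyPDF's command-line tool from Python. It is not the plugin's code. It assumes `ocrmypdf` and Tesseract are installed and on the PATH; the helper names `build_ocr_command` and `ocr_pdf` and the file names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst, language="eng", sidecar=None):
    """Build an ocrmypdf command line (hypothetical helper, not the plugin's code)."""
    cmd = [
        "ocrmypdf",
        "-l", language,   # Tesseract language code(s), e.g. "deu" or "eng+deu"
        "--deskew",       # straighten slightly rotated scans before OCR
        "--skip-text",    # leave pages that already have a text layer untouched
    ]
    if sidecar is not None:
        # Also write the recognized text to a plain-text file.
        cmd += ["--sidecar", str(sidecar)]
    cmd += [str(src), str(dst)]
    return cmd


def ocr_pdf(src, dst, language="eng"):
    """Run OCRmyPDF if it is installed; return True on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not found on PATH, so nothing is run
    result = subprocess.run(build_ocr_command(src, dst, language))
    return result.returncode == 0


if __name__ == "__main__":
    # "scan.pdf" and "scan_ocr.pdf" are placeholder file names.
    print(build_ocr_command(Path("scan.pdf"), Path("scan_ocr.pdf"), language="deu"))
```

The output PDF keeps the scanned page images and gets an invisible text layer on top, which is what makes it searchable. Language packs for Tesseract need to be installed separately for anything other than English.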

Post #4 · OCRthisPDF (not yet published)