05-27-2015, 03:27 PM   #1
crazybrit
Member
Posts: 23
Karma: 208
Join Date: Oct 2014
Device: Nexus5, Nexus9
Tool to rewrite a PDF as new text after OCR

I have some old scanned CS papers in PDF format (I don't have the original printed copies) and the scan quality isn't great. I can upload them to Google Docs (after turning on the option to convert uploads into native Google Docs format) and it will perform OCR, but it is apparently adding the recognized text as a separate layer in the PDF, since the visible text quality remains unchanged.

Is there any open source (or freeware; Windows or Linux) tool that can take this OCR layer and generate a new PDF (or better, a Word/OpenOffice/LaTeX document) with the same general layout, which I can then edit by hand to clean up the conversion errors?
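
To make "same general layout" concrete, here is a rough Python sketch of the step I have in mind. It assumes the OCR layer has already been extracted as word boxes of the form (x, y, text), with coordinates in points measured from the top-left of the page. That input format, the function name words_to_text, and the tolerance values are all made up for illustration; this is not the output of any particular tool.

```python
def words_to_text(words, line_tol=5.0, char_width=6.0):
    """Rebuild roughly laid-out plain text from OCR word boxes.

    words      -- list of (x, y, text) tuples (hypothetical input format)
    line_tol   -- words whose y differs by at most this much share a line
    char_width -- assumed width of one character, used to map x to a column
    """
    # Group words into lines: walk them top to bottom and start a new
    # line whenever y moves further than line_tol from the line's first word.
    lines = []
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        if lines and abs(y - lines[-1]["y"]) <= line_tol:
            lines[-1]["words"].append((x, text))
        else:
            lines.append({"y": y, "words": [(x, text)]})

    out = []
    for line in lines:
        row = ""
        for x, text in sorted(line["words"]):
            col = int(round(x / char_width))
            if row and len(row) >= col:
                # Previous word already reaches this column: keep one space.
                row += " "
            else:
                # Pad with spaces so the word starts near its original column.
                row += " " * (col - len(row))
            row += text
        out.append(row)
    return "\n".join(out)


if __name__ == "__main__":
    sample = [
        (0, 100, "Abstract"),
        (0, 120, "We"),
        (18, 121, "present"),
        (72, 120, "results"),
    ]
    print(words_to_text(sample))
```

Plain text with space padding is only a stand-in here; what I'm really after is a tool that does the same thing but writes out an editable Word/OpenOffice/LaTeX file.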

EDIT: I tried this online site (http://www.free-online-ocr.com/) and the results were horrible, though it was clearly trying to do what I asked above. http://www.onlineocr.net/ was better but is limited to one page at a time unless I register. I'd prefer something that I can run locally rather than uploading to the web.

Last edited by crazybrit; 05-27-2015 at 03:55 PM.