MobileRead Forums - View Single Post

mr ploppy · 05-04-2010, 05:13 PM

I've tried most of these methods, and the best so far is to open the PDF in an OCR program and generate a new text file from it. Then manually delete any headers and footers and fix any broken paragraphs. I haven't seen any of the common OCR problems yet, presumably because the text in a PDF is perfectly straight, with no scanner or paper noise.
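
The header/footer and broken-paragraph cleanup can be partly scripted. Below is a rough Python sketch, not a description of any particular OCR program. It assumes the OCR text separates pages with a form feed character, that headers and footers repeat on at least half the pages (apart from the page number), and that a paragraph not ending in closing punctuation was broken by a page or line break. The function names and the `min_fraction` parameter are made up for illustration.

```python
import re
from collections import Counter

# Characters that usually end a paragraph (an assumption, tune to taste).
TERMINAL = (".", "!", "?", '"', "'", ":")


def _normalise(line):
    # Mask digits so "Page 3" and "Page 4" count as the same line.
    return re.sub(r"\d+", "#", line.strip())


def strip_repeated_lines(pages, min_fraction=0.5):
    """Drop lines that recur on many pages (likely headers/footers)."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-blank line once per page.
        counts.update({_normalise(ln) for ln in page.splitlines() if ln.strip()})
    threshold = max(2, int(len(pages) * min_fraction))
    repeated = {ln for ln, n in counts.items() if n >= threshold}
    return [
        "\n".join(ln for ln in page.splitlines() if _normalise(ln) not in repeated)
        for page in pages
    ]


def join_broken_paragraphs(text):
    """Rejoin wrapped lines and paragraphs split mid-sentence."""
    # Blank lines separate blocks; collapse the whitespace inside each block.
    blocks = [" ".join(b.split()) for b in re.split(r"\n\s*\n", text)]
    paragraphs = []
    for block in blocks:
        if not block:
            continue
        if paragraphs and not paragraphs[-1].endswith(TERMINAL):
            # Previous block stopped mid-sentence, so glue this one on.
            paragraphs[-1] += " " + block
        else:
            paragraphs.append(block)
    return "\n\n".join(paragraphs)


def clean_ocr_text(raw):
    """Split on form feeds, strip repeated lines, then fix paragraphs."""
    pages = strip_repeated_lines(raw.split("\f"))
    return join_broken_paragraphs("\n\n".join(pages))
```

Calling `clean_ocr_text` on the contents of the OCR output file returns the cleaned text. The result still needs a manual read-through: chapter titles without closing punctuation get glued to the following paragraph, and any body line that happens to repeat across pages is dropped along with the headers.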
