03-05-2009, 01:22 PM   #1
Thorkin
Device: kindle
Free/Shareware PDF converters with OCR capability?

I'm trying to convert a set of classic, illustrated children's books ([url=http://www.archive.org/details/merryadventureso00pylerich]Howard Pyle's books of Robin Hood, King Arthur, etc.[/url]) from public-domain PDFs to ebooks I can read on my Kindle.

The problem is that they're image-based PDFs, heavy with illustrations. Some PDF converters can't process them at all; some strip out all the illustrations and convert only the text; some convert every page into an image, which leaves the illustrations looking excellent (well, apart from the "digitized by" watermarks on every page, which I'd like to crop out) but makes the text too small to read easily. The only PDF converter I've found that seems able to process them the way I'd like is ABBYY, but its trial version has a fifty-page limit, which isn't enough for even one book, much less Pyle's collected works.

So as best I can figure out, I need a PDF converter that can OCR the text and also keep the various images in place. Does anyone have any pointers? Thanks!