MobileRead Forums - View Single Post

nrapallo · 02-05-2009, 05:58 PM

Thanks Paul!

I've always admired your efforts here and even considered OCR'ing some of these .pdf's for my own personal use.

However, I want to share a "tip" about getting the OCR'ed text the easy way: get Google's extracted text via their bot search of Paul's website.

If you search Google for "site:djm.cc/library/ filetype:pdf" (without the quotes), you will get Google's listing of the .pdf files in Paul's archive.

Now, just click 'View as HTML' and you will get a 'free' OCR'ed text version of the .pdf ebook. Some work better than others, though. It's a cheat, but it works!
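
For anyone who would rather build the search link than type the query by hand, here is a minimal Python sketch. The function name google_search_url is made up for illustration, and it assumes Google's usual https://www.google.com/search?q=... address. It only builds the link to the results page; it does not fetch anything.

```python
from urllib.parse import urlencode


def google_search_url(site_path: str, filetype: str) -> str:
    """Build a Google search link restricted to one site path and one file type.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Same query as in the tip above: site:<path> filetype:<ext>
    query = f"site:{site_path} filetype:{filetype}"
    # urlencode percent-escapes the colons and slashes and turns the space into '+'
    return "https://www.google.com/search?" + urlencode({"q": query})


# Link for the query in this post; paste it into a browser.
print(google_search_url("djm.cc/library/", "pdf"))
```

Swap in another site path or file type to try the same trick on a different archive.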

Have fun!

EDIT: Oops, only yields the first 50 pages! Sorry, maybe not such a good tip for larger books! :(

Last edited by nrapallo; 02-05-2009 at 06:04 PM.