03-22-2009, 05:10 AM   #10
Student1
Groupie
Posts: 159
Karma: 170
Join Date: Feb 2009
Device: PRS-505
Quote:
Originally Posted by mjh215
Hmm, someone is welcome to check my math, but if it took you 5 hours to go from 3 gigs to 5 gigs, that is 1 GB per 2.5 hours. With roughly 10 TB to go, that is about 25,000 hours, so you should be done in a bit under 3 years... But I like the idea. If I had the space and the bandwidth (I'm not -that- patient), I'm sick enough that I'd do it too...

I'd rather have the PDFs than the EPUBs for archiving. You can always re-OCR the PDFs or use them as a visual reference for correcting an EPUB, whereas with the EPUB you are stuck with the original OCR...

-MJ
Yeah, it's a bit slow, about 7 gigs now... Just letting it slowly download! But it's really nice they didn't think of protecting the web site from a scraper... guess they didn't think someone was crazy enough!

I wonder what they used to automate the OCR process for all those books. Can't believe they did them all manually... that wouldn't make sense!
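
For anyone who wants to check the download-time math from the quote, here is a rough sketch in Python. It assumes the rate stays constant at the observed 2 GB per 5 hours and that 1 TB = 1000 GB; the function name and the 10 TB figure as "remaining" are just illustrative choices, not anything measured precisely.

```python
def estimate_remaining_hours(gb_start, gb_end, hours_elapsed, gb_remaining):
    """Extrapolate total remaining hours from one observed download interval.

    Assumes a constant download rate, which a real scrape will not have.
    """
    hours_per_gb = hours_elapsed / (gb_end - gb_start)  # 5 h / 2 GB = 2.5 h per GB
    return hours_per_gb * gb_remaining


# Figures from the thread: 3 GB -> 5 GB in 5 hours, roughly 10 TB left.
# Assumption: 1 TB = 1000 GB (decimal units).
hours = estimate_remaining_hours(3, 5, 5, 10 * 1000)
days = hours / 24
years = days / 365.25

print(f"{hours:.0f} hours, about {days:.0f} days, about {years:.1f} years")
```

At that rate it works out to roughly 25,000 hours, a bit under three years, so any speed-up in the connection changes the picture a lot.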