10-02-2013, 09:44 PM   #8
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by mb2u
All my prospective conversions are non-fiction.
Glorious!

If you are serious about OCRing and getting high quality work out there, I would not mind teaching you everything I know (I am available over AIM/YIM/MSN/Skype/email).

While you can OCR for your own personal benefit, the benefit does not outweigh the costs (I spend about 8-15 hours just to get a great EPUB, but when you are just starting out, you might spend 40+ hours on a book).

In my opinion, you should try to tackle works that are in the public domain, or books that are released under a CC (Creative Commons) license. After finishing your OCR and making a clean EPUB, you can then post it on MobileRead/elsewhere so that the ENTIRE WORLD can benefit from your conversion (instead of just you).

Archive.org has scans of a massive number of public domain books. Or if you are interested in some "training materials", I have a bunch of journal articles that need OCR (~13 pages each).

I believe tackling the easy/short stuff first would have built up my skills/familiarity with the tools way faster, and it definitely keeps the motivation up (makes you feel like you are actually ACCOMPLISHING SOMETHING).

When I first jumped into OCR I decided it would be a good idea to tackle all the hard stuff first... I wish I hadn't done that! When I tackled large books that were complex/way out of my league, I would spend an entire week on one and feel like I had gotten nowhere!

Quote:
Originally Posted by mb2u
I know what you mean....it would destroy the flow of the story correcting errors in fiction. It would demolish it!
For the few fiction books that I actually wanted to read (that were PDF only)... I pretty much just had to feed them through OCR, export, split the chapters really fast, and run a few basic cleanup regexes. Then I read through each book in Sigil and fixed the errors as I came across them while reading. Took forever, but nothing was spoiled.
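
For anyone wondering what "split chapters and run a few basic cleanup regex" might look like in practice, here is a rough Python sketch. It is only an illustration, not the exact set of patterns used for those books: the function names `split_chapters` and `clean_ocr_text` are made up for this example, and it assumes the OCR export is plain text with chapter headings of the form "Chapter 1" and a blank line between paragraphs.

```python
import re
from typing import List


def split_chapters(text: str) -> List[str]:
    # Split just before every line that starts with "Chapter <number>".
    # The heading format is an assumption; adjust the pattern per book.
    parts = re.split(r"(?m)^(?=Chapter \d+)", text)
    return [p.strip() for p in parts if p.strip()]


def clean_ocr_text(text: str) -> str:
    # Rejoin words hyphenated across a line break ("con-\nversion").
    # Caution: this also merges real compounds such as "well-\nknown".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks inside a paragraph into spaces,
    # leaving blank lines (paragraph breaks) alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs into one space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    # Drop stray spaces before punctuation ("forever ." -> "forever.").
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    return text.strip()


# Tiny made-up sample standing in for an OCR export.
sample = (
    "Chapter 1\n\nIt was a con-\nversion  that took\nforever .\n\n"
    "Chapter 2\n\nNothing was spoiled."
)

cleaned = [clean_ocr_text(chapter) for chapter in split_chapters(sample)]
for chapter in cleaned:
    print(chapter)
    print("-----")
```

Patterns like these only catch the mechanical errors (broken lines, stray spaces); misread characters still have to be fixed by hand while reading, as described above.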