MobileRead Forums - View Single Post

delphidb96 · 03-02-2008, 11:45 AM

Quote:

Originally Posted by captaingeorges

Thanks Harry!

But when you talked about new books to upload or books that you are working on or the Gutenberg project, I guess somebody must be scanning them ???

Regards

Georges

There are. Just as there are people who buy a scanner and start scanning in the dead-tree books they've accumulated in their personal libraries. I'd recommend either building a book jig to hold the book open while you take digital photos of each page with a cheap 5-7 MP camera, or buying a scanner designed to hold the book's pages flat. The only inexpensive one is the Plustek OpticBook, which runs $250-$350 on eBay.

Once you've got the pages scanned or photographed, you need to either bundle them into a PDF file or run them through Optical Character Recognition (OCR) software to generate a text file from the page images. There are some decent, inexpensive OCR packages available, but the OpticBook comes with one bundled in with the rest of its software.

After creating the text file and cropping any illustrations into individual image files, you then need to edit the content of the text file. Most books use slightly different fonts, and unless you're willing to 'train' the OCR software, it will probably only get 85%-95% of the characters right. Not only that, but unless you set the cropping factors right, the scan will capture the page headers and footers as well. That means the raw file will have a bunch of page numbers, titles and authors' names scattered throughout the text, and paragraphs, and even words, will be broken inappropriately.
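Some of that cleanup can be scripted before you start hand-editing. Here is a rough Python sketch, not a finished tool. It assumes you have one string of OCR text per page and that you already know the running header and footer text; the function names and the sample strings are made up for illustration. It drops bare page numbers and known running heads, rejoins paragraphs split across a page break, and mends words hyphenated at a line end.

```python
# Characters that suggest a paragraph really ended at the bottom of a page.
SENTENCE_END = (".", "!", "?", '"', "'")


def strip_page_furniture(page, running_heads):
    """Return the page's lines minus page numbers and running heads."""
    heads = {h.strip().lower() for h in running_heads}
    kept = []
    for line in page.splitlines():
        text = line.strip()
        if text.isdigit() or text.lower() in heads:
            continue  # bare page number, or a known header/footer line
        kept.append(text)
    # Trim the blank lines left behind at the top and bottom of the page.
    while kept and not kept[0]:
        kept.pop(0)
    while kept and not kept[-1]:
        kept.pop()
    return kept


def join_lines(lines):
    """Join one paragraph's lines, mending words split by a hyphen."""
    text = ""
    for line in lines:
        if text.endswith("-"):
            # Heuristic: this also merges genuine compounds split at a line end.
            text = text[:-1] + line
        elif text:
            text += " " + line
        else:
            text = line
    return text


def clean_ocr_pages(pages, running_heads=()):
    """Turn a list of per-page OCR strings into a list of paragraphs."""
    paragraphs, current = [], []
    for page in pages:
        lines = strip_page_furniture(page, running_heads)
        # At a page break, keep the paragraph going unless it looks finished.
        if current and current[-1].endswith(SENTENCE_END):
            paragraphs.append(join_lines(current))
            current = []
        for line in lines:
            if line:
                current.append(line)
            elif current:  # a blank line closes the paragraph
                paragraphs.append(join_lines(current))
                current = []
    if current:
        paragraphs.append(join_lines(current))
    return paragraphs
```

Call it as `clean_ocr_pages(pages, ["My Book Title", "Jane Author"])`. The page-break rule is only a guess: a paragraph that happens to end exactly at the bottom of a page is handled correctly, but one whose last sentence on the page ends with a period and then continues on the next page gets split, so you still need to proofread the result.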

Once the text files have been edited, you can create an RTF, PDF or HTML document containing the text and any associated images, and then run that through either Mobigen or BookDesigner to create your ebook file.
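If you go the HTML route, the document can be assembled with a few lines of Python. This is only a sketch under my own assumptions: the `build_html` name and the `(paragraph index, filename)` way of placing images are invented for the example, and I'm not claiming this exact markup is what Mobigen or BookDesigner expect, so check the output against whichever tool you use.

```python
from html import escape


def build_html(title, paragraphs, images=()):
    """Return a minimal HTML document as a string.

    images is a sequence of (paragraph_index, filename) pairs; each image
    is placed right after the paragraph with that index (hypothetical layout).
    """
    placed = {}
    for index, filename in images:
        placed.setdefault(index, []).append(filename)

    body = [f"<h1>{escape(title)}</h1>"]
    for i, para in enumerate(paragraphs):
        # escape() protects any stray &, < or > left in the OCR text.
        body.append(f"<p>{escape(para)}</p>")
        for filename in placed.get(i, []):
            body.append(f'<p><img src="{escape(filename)}" alt=""></p>')

    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n<body>\n" + "\n".join(body) + "\n</body>\n</html>\n"
    )
```

Write the returned string to a file saved as UTF-8, and keep the image files in the same folder so the relative `src` paths resolve.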

Simple! (But somewhat labor intensive.)

Derek