10-29-2008, 10:31 AM   #18
jharker
Developer
Posts: 345
Karma: 3473
Join Date: Apr 2007
Location: Brooklyn, NY, USA
Device: iRex iLiad v1, Blackberry Tour, Kindle DX, iPad.
Quote:
Originally Posted by TallMomof2
A scanned page image is essentially a photograph or picture of the page. Like a picture it is not seen as text (characters) by the ebook program. What you have to do is run the scanned pages through an OCR program to convert the images to text so that it is treated as text instead of an image. The "gotcha" is that conversion usually results in many errors that require a human to edit the text. I can't tell you how many ebooks I've read that are poorly converted scanned pages. And these are from legitimate publishers.
This is the reason I don't buy ebooks any more. Now I mostly read free books or books out of copyright. I once spent about $30 on ebooks and they all had major errors or flaws that were clearly OCR-related. Many of those errors would have been caught by a spellcheck program or easily noticed by a human proofreader. One book was missing all of its quotation marks. It's going to be a while before I trust e-publishers enough to give them money again.

Hopefully Google can get OCRopus running well enough to make decent ebooks available...