Originally Posted by etienne66
While on the subject of PDF, I've seen at least a few PDFs from Google's book scanning project. Although the results look good, the underlying text that you can't see contains many typographic errors that may prevent you from searching the document or using a dictionary in your ereader. I'm sure that ahi wouldn't allow such bad workmanship to pass without correction though.
Google Books scans are run through an automatic OCR program; no manual corrections are done. For most books, the process produces mostly searchable text, except for odd terms that aren't in whatever dictionary they're using. That includes mathematical notation, and likely medical or other scientific terms. This makes the process useful for novels and considerably less useful for technical texts.
As far as we've been able to establish, their epubs are made by converting the text from those OCR'd works. Shudder.
If Google gets their way, "ebook" will come to mean "crappy digital version of the Real Thing" in most people's minds.