Quote:
Originally Posted by Elfwreck
If that's not what Topaz does, there's no reason not to OCR the book into a USEFUL format and work from there. (If that is what Topaz does, they might be scrimping on time by not correcting the OCR errors--because if you never see the actual text, only the word-images, it's not likely to matter much, as long as over 95% are OCR'd correctly. And they usually are.)
|
My understanding is that Topaz is a quick-and-dirty way for Amazon to get as many books as possible out for the Kindle.
Quote:
And I suppose none of the publishing houses would consider going to the darknet, grabbing a scanned-and-OCR'd text copy of the book from some fan, and proofreading *that* instead of starting from scratch.
|
They should. Some of the pirated scanned/OCR'd stuff out there is of higher quality than what some of the publishers are putting out anyway.