Quote:
Originally Posted by Tex2002ans
Yes. Archive.org just does a whole host of automated conversions... and I wouldn't use them if you can help it.
I usually just stick with their:
1. B&W PDF. Usually this is decent. In the case of this specific "yellowed book", it was crap.
2. Color PDF. This matches what they show in their online reader. Helpful if working with color, drawings, or "yellowed books". (You can do your own contrast/color corrections from this, and create a better grayscale/B&W version.)
3. As a last resort, work directly from the JPEG2000 images. These are the highest resolution/quality.
Do not touch their "EPUBs" or any of their other "ebook" formats (they are just automatically run through OCR, no proofing or anything). You're better off working from the source files and recreating your own OCR/ebooks from that.
|
I always use the original image files and run them through ABBYY, but not everybody has that. In that case, working from the text or EPUB files at archive.org is an option, especially when their OCR is as clean as it is in this case.
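For anyone who wants to try the do-it-yourself contrast correction on a "yellowed book" mentioned in the quote above, here is a rough sketch of the idea in plain Python. It assumes the page has already been decoded into rows of (R, G, B) tuples; the function names, the 5%/95% cut-offs, and the threshold of 128 are my own illustrative choices, not anything Archive.org or ABBYY actually uses.

```python
def to_gray(pixel):
    """Convert one (R, G, B) pixel to a 0-255 gray value (standard luma weights)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def stretch_contrast(gray, low_pct=5, high_pct=95):
    """Map the low/high percentile gray values to 0 and 255 (hypothetical defaults)."""
    values = sorted(v for row in gray for v in row)
    lo = values[(len(values) - 1) * low_pct // 100]
    hi = values[(len(values) - 1) * high_pct // 100]
    if hi <= lo:
        # Flat page (blank or solid): nothing to stretch, return a copy.
        return [row[:] for row in gray]
    scale = 255 / (hi - lo)
    return [[min(255, max(0, round((v - lo) * scale))) for v in row]
            for row in gray]


def binarize(gray, threshold=128):
    """Turn a grayscale page into pure black (0) and white (255)."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


def clean_page(rgb_rows, threshold=128):
    """Grayscale, stretch, then threshold one page of RGB pixels."""
    gray = [[to_gray(p) for p in row] for row in rgb_rows]
    return binarize(stretch_contrast(gray), threshold)


# Tiny made-up "page": yellowed paper with faded brown ink.
paper = (230, 210, 160)
ink = (90, 70, 50)
page = [[paper, ink, paper],
        [paper, paper, ink]]
cleaned = clean_page(page)
```

The point of the stretch step is that yellowed paper and faded ink both sit in the middle of the gray range, so a fixed threshold on the raw image tends to lose one or the other. Pulling the paper toward white and the ink toward black first makes the threshold far less touchy. A real scan would need an image library to read the files and probably a smarter, locally adaptive threshold.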
Quote:
Originally Posted by hobnail
I've also done it using the txt file, and depending on the quality of the scan and the original book, it can be a painful amount of work.
|
No denying this.