Thanks Patricia and RWood. The OCR errors and format irregularities in the TXT files are pretty bad. Plus, the pagination of the original book (from which the scan was taken) has been retained, so I need to go back and excise the footers. This is going to be LOTS of work.
RWood, did you use the Internet Archive's PDFs? Are they text or image-based? The PDF for my book is 50 MB. I didn't download it because my ISP meters bandwidth usage, and my connection is in constant use already.