Quote:
Originally Posted by Ghitulescu
Yep, you most likely figured it out.
I'm betting the problem is the monolithic HTML file: ~900 KB. If you have an older ereader, that would crash it (they can only handle files up to ~300 KB).
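If you want to check an EPUB for this yourself, here's a rough Python sketch. An EPUB is just a ZIP archive, so you can list the HTML files inside and flag the big ones. The function name and the 300 KB default are my own picks for illustration (the limit is just the ballpark from above, not a spec value).

```python
import zipfile


def oversized_html_entries(epub, limit_kb=300):
    """Return (name, size_kb) for each HTML/XHTML file larger than limit_kb.

    `epub` can be a path or a file-like object. The 300 KB default is only
    a ballpark for older ereaders, not an official limit.
    """
    html_suffixes = (".html", ".htm", ".xhtml")
    hits = []
    with zipfile.ZipFile(epub) as zf:
        for info in zf.infolist():
            if info.filename.lower().endswith(html_suffixes):
                # file_size is the uncompressed size in bytes
                size_kb = info.file_size / 1024
                if size_kb > limit_kb:
                    hits.append((info.filename, round(size_kb)))
    return hits
```

Call it as `oversized_html_entries("book.epub")` (hypothetical filename). An empty list means no single HTML file is over the limit.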
Like you also figured out, a simple Calibre EPUB->EPUB with file splitting should take care of that issue.
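If you'd rather script it than click through the GUI, something like this should work, assuming Calibre's `ebook-convert` command-line tool is on your PATH. As far as I know, `--flow-size` is the EPUB output option that splits HTML files larger than the given size in KB. The filenames and helper names are placeholders I made up.

```python
import subprocess


def build_split_command(src, dst, flow_size_kb=260):
    """Build an ebook-convert command that splits large HTML files.

    260 KB is used here as a conservative split size; adjust to taste.
    """
    return ["ebook-convert", src, dst, "--flow-size", str(flow_size_kb)]


def split_epub(src, dst, flow_size_kb=260):
    """Run the conversion. Requires Calibre to be installed."""
    subprocess.run(build_split_command(src, dst, flow_size_kb), check=True)
```

Usage would be `split_epub("book.epub", "book_split.epub")`. Writing to a new file keeps the original untouched in case the conversion mangles something.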
Also, the book is laid out in two-column format. Usually, that's incredibly hard to OCR correctly. OCR might think both columns are a single line, so you get half-left/half-right sentences, making the ebook completely unreadable.
According to the metadata, it looks like they ran it through FineReader 8.0.
I ran it through FineReader 12 for you, then created a very rough EPUB. This one should be more accurate and at least won't have all the headers/footers clogging up the text.
Note: This book's font also had very low-hanging+round 'g's. OCR thought they were 'O's on their own line, so you'll see lots of those randomly appearing within the EPUB.
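If those stray 'O's bug you, here's a quick-and-dirty Python sketch for stripping them out of the extracted text. It assumes each bogus 'O' really does sit alone on its own line (capital O, nothing else but whitespace). It won't repair the words the 'g's came from, and it would also eat a legitimate line that is just "O", so eyeball the result. The function name is mine.

```python
import re

# A line containing only a capital O, optionally padded with spaces/tabs
_STRAY_O = re.compile(r"^[ \t]*O[ \t]*$")


def strip_stray_o_lines(text):
    """Drop lines that consist solely of a capital 'O'."""
    kept = [line for line in text.splitlines() if not _STRAY_O.match(line)]
    return "\n".join(kept)
```

Lines that merely start with 'O' (like "Oh no") are left alone, since the pattern has to match the whole line.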