I have scanned a couple of backlist books using ABBYY FineReader and saved the output as a Word doc. It does a great job, but even 99.9% accuracy means a lot of typos in a book of 100,000 words (roughly 500,000 characters x 0.1% = about 500 errors). Quite apart from all else, it's a whole lot easier to proof in a Word doc than in an epub.
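To put that estimate in reusable form, here is a minimal sketch of the same arithmetic. The function name and the figure of 5 characters per word are my own assumptions (the latter is simply what 500,000 characters for 100,000 words implies), so treat the result as a ballpark only.

```python
def expected_typos(words, chars_per_word=5, error_rate=0.001):
    """Rough count of wrong characters left after OCR.

    chars_per_word=5 and error_rate=0.001 (99.9% accuracy) are
    assumed figures, not measurements.
    """
    return round(words * chars_per_word * error_rate)


print(expected_typos(100_000))  # → 500
```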
In both books, interestingly, the same error predominated: a lower-case m rendered as a lower-case rn.
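Since that one error is so regular, it lends itself to a scripted check. The following is only a sketch of one possible approach, not something I have run on a full book: it flags words that contain "rn", are not in a word list, and become a known word when a single "rn" is swapped for "m". The function names and the tiny `known` set are hypothetical; in practice you would load a proper dictionary file.

```python
import re


def rn_candidates(word):
    """Yield copies of word with one 'rn' occurrence replaced by 'm'."""
    start = word.find("rn")
    while start != -1:
        yield word[:start] + "m" + word[start + 2:]
        start = word.find("rn", start + 1)


def find_rn_suspects(text, known_words):
    """Map each suspect word to the known word it was probably meant to be."""
    suspects = {}
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        # Skip words without 'rn' and genuine words such as "modern".
        if "rn" not in lower or lower in known_words:
            continue
        for fixed in rn_candidates(lower):
            if fixed in known_words:
                suspects[word] = fixed
                break
    return suspects


# Hypothetical word list, for illustration only.
known = {"summer", "morning", "modern", "time", "the", "in"}
sample = "in the surnmer rnorning the modern tirne"
print(find_rn_suspects(sample, known))
```

Because it only tries one replacement at a time, a word with two misread m's would slip through, and it reports suspects rather than changing anything, so every hit still gets a human look.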
I use Word2CleanHtml dot com to get clean HTML from the Word docx. There are of course many other ways to do the same, but this one is quick, easy and all but flawless. I open the HTML file in Sigil and go from there.