Why resurrect such an old thread from its grave? There are many more recent threads about this, or you could just start a new one...
Of course, I use my own Word add-in to clean up OCR (and publisher) errors and create an ePUB with clean HTML as the result.