Sounds like poor-quality source material. What format are most of your source books in?
Can you post code snippets from them to illustrate the issue? (That's easy to do if the source is HTML or EPUB.) Ask for help if needed.
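If you're not sure how to get at the code: an EPUB is just a ZIP archive with (X)HTML files inside, so something like the rough sketch below can pull out a chunk for posting. The function name `epub_html_snippet` and the 1500-character cutoff are my own picks for illustration, not anything from a particular tool, and it simply grabs the first HTML file alphabetically, which may not be the first chapter.

```python
import zipfile


def epub_html_snippet(epub_path, max_chars=1500):
    """Return (member_name, snippet) for the first HTML/XHTML file in an EPUB.

    Hypothetical helper for illustration: it assumes the book stores its
    text as .xhtml/.html/.htm files inside the ZIP container.
    """
    with zipfile.ZipFile(epub_path) as book:
        # Collect the content documents; sorted alphabetically, not in reading order.
        pages = sorted(
            name
            for name in book.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )
        if not pages:
            raise ValueError("no HTML/XHTML files found in this EPUB")
        name = pages[0]
        # Decode leniently so a badly encoded book still yields something to post.
        text = book.read(name).decode("utf-8", errors="replace")
        return name, text[:max_chars]
```

Call it as `name, snippet = epub_html_snippet("mybook.epub")` (with your own file name) and paste `snippet` into your post.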
I strongly recommend that you learn how to use Sigil if you plan on repairing books. It's no harder than using MS Word, and it has its own helpful forum.
PS: In my experience, the tempting mega collections (1000+ books in one download) are atrocious quality, and the older the book, the more likely it is to have been through multiple bad conversions before it even reached you. I'm thinking of, e.g., golden-era sci-fi, written way before ebook formats or decent scanning software were invented.
"Scanned in a galaxy far, far away" is NOT a badge of quality! And if there is no legal e-book version on sale anywhere, then the book was most likely scanned and run through OCR software, then converted from PDF, not typed in word by word from a printed original!