My guess (and it is just that) is that one of two things is the culprit:
- The metadata differs between the sample files you mentioned: perhaps a slightly different encoding, or some other small discrepancy. If either sample book comes with CSS, the issue may be there instead. (Although I know you said you checked this.)
- The original source document was handled differently at the publisher's end. The publisher(s) may have used different software programs (or versions) to generate their output, and the results, as you're finding, are not optimal.
You may need to use a text editor with strong regular expression (regex) support in its search & replace function. Some things can be automated; others will require oversight and your best judgment. On Windows, Notepad++ (free, available from SourceForge) or RegexBuddy (commercial software) can be used if you don't already have a regex tool. A word of warning: regex logic gets quite convoluted once you need more than a simple search & replace.
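To give a rough idea of what an automated regex pass looks like, here is a minimal sketch in Python. The three rules are hypothetical examples of typical conversion debris, not a diagnosis of your files, and the names `RULES` and `clean` are made up for illustration; you would swap in patterns that match whatever is actually wrong.

```python
import re

# Hypothetical cleanup rules, applied in order. Replace these with
# patterns that match the actual problems in your files.
RULES = [
    (re.compile(r"[ \t]+\n"), "\n"),        # strip trailing spaces/tabs before a line break
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),  # rejoin a word hyphenated across a line break
    (re.compile(r" {2,}"), " "),            # collapse runs of spaces into one
]

def clean(text):
    """Apply each search & replace rule to the text, in order."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

if __name__ == "__main__":
    sample = "exam-\nple  text  \nnext"
    print(repr(clean(sample)))
```

The second rule is a good example of where judgment comes in: it would also turn a legitimately hyphenated "well-" / "known" split into "wellknown", so a rule like that is better reviewed match by match than run blindly over a whole book.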