I used to try to clean up the HTML, but in my opinion it's more work than is necessary
for my needs. I now take a big sledgehammer: I delete all of the original CSS and replace it with my minimal CSS, as explained
here. After "fixing" the CSS this way, all that crap just stops creating problems (although it takes self-discipline not to look too closely at their horrid HTML). I still remove the spans wrapped around every word and the classless ones around paragraphs, and I bold the chapter titles when they use p tags instead of h tags.
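
For anyone who wants to script the span removal rather than do it by hand, here is a rough sketch of one way it could be done with Python's standard `html.parser`. It is not the tool or method described above, just an illustration: the names `SpanUnwrapper` and `unwrap_classless_spans` are made up for this example, and it assumes "classless" means a span with no attributes at all. It drops those span tags, keeps their contents, and passes everything else through.

```python
from html.parser import HTMLParser


class SpanUnwrapper(HTMLParser):
    """Hypothetical helper: re-emits HTML, dropping <span> tags that have no attributes."""

    def __init__(self):
        # convert_charrefs=False so entities like &amp; are passed through untouched
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # one entry per open span: True = kept, False = dropped

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            keep = bool(attrs)  # assumption: a span with any attribute is worth keeping
            self.span_stack.append(keep)
            if not keep:
                return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # self-closing tags such as <br/> are copied as written
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack:
            if not self.span_stack.pop():
                return  # closing tag of a span we dropped
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def unwrap_classless_spans(html: str) -> str:
    """Return html with attribute-less spans removed and their contents kept."""
    parser = SpanUnwrapper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<p><span>Hello</span> <span class="i">world</span></p>'
    print(unwrap_classless_spans(sample))
```

Tracking each open span on a stack is what lets nested spans come out right: a dropped outer span does not take a styled inner span with it. Try it on a copy of the book first, since badly broken markup may not round-trip cleanly.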