Some of those spans apply italics or bold, so to do it right, each one really should be checked rather than blindly running a regex script that removes them all. Quite a few of them are useless though, including many empty <span></span> pairs that appear for no reason, sometimes several times within a sentence.
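For what it's worth, here is a rough sketch of a middle ground between checking every span by hand and stripping them all: drop only the spans that carry no attributes at all (so they can't be applying a style), and leave anything with a class or style alone. It uses Python's standard html.parser. The names BareSpanStripper and strip_bare_spans and the "italic" class are made up for illustration, and I'm assuming that in this book a bare span never does anything useful, which would need checking against the stylesheet first.

```python
from html.parser import HTMLParser


class BareSpanStripper(HTMLParser):
    """Drop <span> tags that have no attributes; pass everything else through."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; are passed through as-is
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_kept = []  # one bool per open <span>: True if we kept its open tag

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            keep = bool(attrs)  # assumption: a span with no attributes is useless
            self.span_kept.append(keep)
            if not keep:
                return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # self-closing tags such as <br/>; a bare <span/> is dropped too
        if tag == "span" and not attrs:
            return
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # only emit </span> if the matching open tag was kept
        if tag == "span" and self.span_kept and not self.span_kept.pop():
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_bare_spans(html):
    parser = BareSpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = '<p><span>Hello</span> <span class="italic">world</span><span></span>.</p>'
print(strip_bare_spans(sample))
```

A parser is used instead of a regex because the spans can be nested, and a pattern like <span>(.*?)</span> can pair an opening tag with the wrong closing tag. Spans that do have a class would still need to be checked by eye against the CSS, and I'd run it on a copy of the book first.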
And with over 10,000 instances of span in the book, I'm not sure I want to do the publisher's job on it since I've read it already. If it were a book I had scanned, even for my own use, I'd have gladly cleaned it all up. I did make a lot of corrections for spelling and grammar, words omitted or run together, and punctuation mid-sentence where you know it was picked up by OCR and then not proofed very well.
I may still give in though and do it when I'm bored, just to see if the page numbers change (although I'm not sure they will).