When doing conversions I find myself spending more and more time just doing everything in regex. Getting a bit too good at it really.
Originally Posted by DiapDealer
I wish chapter headers were more consistently tagged, but I also understand why they're oftentimes not.
Yeah, it doesn't help that a lot of tools with automatic tidying break completely valid markup. For chapter markers and part/book pages I generally just use a single h2 tag, throw in a line break, a horizontal rule and the rest - looks perfect on everything, even converts to mobi without trouble. However, every now and then I'll tidy and prettyprint... forgetting. Next thing I know there are 3 h2's, empty paragraphs, styled hr's and a whole load of spans and inline css - urgh
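For what it's worth, the sort of cleanup I mean looks roughly like this in Python. It's only a sketch: the function name and the sample markup are made up for illustration, the patterns assume double-quoted style attributes and no nested oddities, and regex on HTML will always trip over some edge case, so check the result by eye.

```python
import re

def clean_tidy_damage(html: str) -> str:
    """Hypothetical helper: undo some typical damage left by an auto-tidy pass."""
    # Drop paragraphs that contain nothing but whitespace or &nbsp;
    html = re.sub(r'<p\b[^>]*>(?:\s|&nbsp;)*</p>\s*', '', html)
    # Unwrap spans: remove the opening and closing tags, keep the text inside
    html = re.sub(r'</?span\b[^>]*>', '', html)
    # Strip inline style attributes (assumes double-quoted values)
    html = re.sub(r'\s+style="[^"]*"', '', html)
    return html

# Made-up sample of a chapter header after a tidy run
sample = (
    '<h2><span style="font-weight: bold">Chapter 1</span></h2>\n'
    '<p>&nbsp;</p>\n'
    '<hr style="width: 50%" />\n'
    '<p>It began.</p>'
)
cleaned = clean_tidy_damage(sample)
print(cleaned)
```

It doesn't merge the duplicated h2's, that one really depends on what the tidy tool did to the headings, so I'd still do it by hand or with a one-off pattern.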
But anyway, I'd suggest anyone who does conversion from sites/pdf/poor formats in general should get hold of the terribly-badly-named RegexBuddy - makes life a whole lot easier. I guess there might be a similar free/OSS tool, but last I looked they were pretty lacking.