Robert - Thanks for the interesting links. I downloaded the American Verse Project version of the Emily Dickinson poems, series 1. I must admit, looking at the html source, it looks pretty horrible and would be a nightmare to turn into something reasonably semantic, though it was, surprisingly, still readable in mobipocket.
I tried manybooks.net and it produced a pretty good version of some Emily Dickinson poems with good clean code. I think manybooks is worth checking out as a source of html versions of PG texts. But the bad news is that it produced a pretty horrible version of Leaves of Grass, with new paragraphs starting mid-line. I don't envy anyone trying to produce a decently formatted version.
Laughing Vulcan - did you do the formatting in html or after conversion? If you have a decent version in html, could you upload that as well? Thanks.