Originally Posted by kovidgoyal
So really what the tool will have to do is:
1) Accept HTML input
2) Parse the HTML input into some simple internal markup
3) Try to automatically identify structural components (or ask the user to provide input to help identify them)
4) Provide an editor interface for the internal markup
5) Export the internal markup to EPUB
My understanding exactly. The only difference is that I'm thinking of making the "simple" internal markup not so simple. But yes, the tool has to parse the initial HTML and create a new one from it.
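To make the steps listed above a bit more concrete, here is a rough Python sketch of steps 2, 3 and 5, using only the standard library. Everything in it is an assumption for illustration, not a design anyone in this thread has agreed on: the `Block` type, the `parse_html`, `identify_chapters` and `export_xhtml` names, and the "every heading starts a chapter" heuristic are all hypothetical. The export step only renders a chapter's XHTML body; packaging that into a real EPUB container is left out, and the editor interface (step 4) is not covered at all.

```python
from dataclasses import dataclass
from html import escape
from html.parser import HTMLParser


@dataclass
class Block:
    """One unit of the hypothetical internal markup."""
    kind: str  # "heading" or "para"
    text: str


class BlockParser(HTMLParser):
    """Step 2: flatten HTML input into a list of Block objects."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._kind = None  # kind of the block currently being collected
        self._buf = []     # text fragments of that block

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._kind, self._buf = "heading", []
        elif tag == "p":
            self._kind, self._buf = "para", []

    def handle_data(self, data):
        if self._kind is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._kind is not None and (tag in self.HEADINGS or tag == "p"):
            # Collapse runs of whitespace and drop empty blocks.
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append(Block(self._kind, text))
            self._kind = None


def parse_html(source):
    parser = BlockParser()
    parser.feed(source)
    parser.close()
    return parser.blocks


def identify_chapters(blocks):
    """Step 3: naive guess at structure; each heading starts a new chapter."""
    chapters = []
    for block in blocks:
        if block.kind == "heading":
            chapters.append({"title": block.text, "paras": []})
            continue
        if not chapters:
            # Text before the first heading goes into an untitled chapter.
            chapters.append({"title": "Untitled", "paras": []})
        chapters[-1]["paras"].append(block.text)
    return chapters


def export_xhtml(chapter):
    """Step 5 (partial): render one chapter as an XHTML fragment."""
    body = "".join(f"<p>{escape(p)}</p>" for p in chapter["paras"])
    return f"<h1>{escape(chapter['title'])}</h1>{body}"
```

Calling `identify_chapters(parse_html(source))` on a page gives a list of chapter dictionaries that an editor could let the user correct before `export_xhtml` is run on each one. A less simple internal markup would presumably need more block kinds than the two used here.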