Originally Posted by RikaStrom
Nick, a question about this.
Several times I have "processed" an html book through GEB and the result was a disaster of coding and formatting. I've always held the opinion that this was because the original source code was "dirty". Is this what you mean when you say you "clean up" the html code before processing html to imp?
I mainly use eBook Publisher (not GEB Librarian) to convert .html files to .imp ebooks. If the .html was prepared or copied from a website, eBook Publisher will complain about a lot of things (it's great at finding most errors), and you then have to edit the .html file so that it will work in the .imp ebook.
Recent versions of eBook Publisher have introduced some strange limitations, such as only accepting "\" in the path of absolute filenames and requiring removal of any "width=" within image references like <img src="cover.jpg" width="50%">, among others.
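If you'd rather not make those two fixes by hand, here is a rough Python sketch of how they could be scripted. It is not part of eBook Publisher: the function names strip_img_width and to_backslash_path are made up for illustration, and it only covers the two limitations mentioned above, assuming they behave as described.

```python
import re

# Matches a whole <img ...> tag, case-insensitively.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

# Matches a width attribute (quoted or unquoted value) plus its leading whitespace.
WIDTH_ATTR = re.compile(
    r"""\s+width\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def strip_img_width(html: str) -> str:
    """Remove width= attributes from <img> tags only (hypothetical helper)."""
    return IMG_TAG.sub(lambda m: WIDTH_ATTR.sub("", m.group(0)), html)


def to_backslash_path(path: str) -> str:
    """Rewrite forward slashes as backslashes in an absolute filename (hypothetical helper)."""
    return path.replace("/", "\\")


if __name__ == "__main__":
    print(strip_img_width('<img src="cover.jpg" width="50%">'))
    print(to_backslash_path("C:/books/cover.jpg"))
```

A regular expression is a blunt tool for HTML, so unusual markup can fool it. Run it on a copy of the file and check the result before building the .imp.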
When you are not the preparer/author of the .html, a lot of things can be coded wrongly or differently, because eBook Publisher expects things to be presented in its own way or it doesn't work. These "errors" are what I clean up when using .html as a source. I'm sorry I can't give you a checklist of things to change, as each .html would be different, but after doing one or two conversions you'll be able to see a pattern of things that need to be "fixed".
If you cannot determine why something doesn't work, try posting a new thread with the issue and help will come along...