Quote:
Originally Posted by DaleDe
Here is a test case. It is the first few chapters of a book I am trying to convert. There is some font boilerplate at the front that would be good to remove also. This is a typical Word file saved as HTML. I chopped the main part of the file off so some rules such as /html entry are missing and some links won't work. There is also an @page entry that I trimmed the margins out of so that they wouldn't get in the way of a good ePUB.
You still haven't said what you want Sigil to do with this.
If you want Sigil to remove the junk that Word stuffs into its exported HTML, then that's already planned. Tidy has some nice flags for this and I'll be extending that behavior too. This would be available through a dialog that pops up on import and asks the user to tick import options like "Remove excessive Word markup" or whatever.
If this is what you want, add it to the tracker so I don't forget. But bear in mind that this is not high on my list of priorities because there are more important things to implement first.
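To give a rough idea of the kind of cleanup meant by "Remove excessive Word markup", here is a minimal Python sketch. It is not Sigil's code and not how Tidy does it; the function name `strip_word_markup` and the regex patterns are assumptions made up for illustration, and a real implementation would work on a parsed document rather than on raw text.

```python
import re

# Hypothetical helper for illustration only; not part of Sigil or Tidy.
def strip_word_markup(html: str) -> str:
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)

    # Drop Office-namespaced tags like <o:p> and </o:p> (tags only, text is kept)
    html = re.sub(r"</?[ovw]:[^>]*>", "", html, flags=re.IGNORECASE)

    # Drop class attributes whose value starts with "Mso" (MsoNormal etc.)
    html = re.sub(r"""\s+class=(?:"Mso[^"]*"|'Mso[^']*'|Mso[^\s>]*)""", "",
                  html, flags=re.IGNORECASE)

    # Inside style attributes, keep only declarations that are not mso-*
    def clean_style(match):
        quote = match.group(1)
        decls = [d.strip() for d in match.group(2).split(";")]
        kept = [d for d in decls if d and not d.lower().startswith("mso-")]
        if not kept:
            return ""  # nothing left, so remove the whole attribute
        return " style=" + quote + "; ".join(kept) + quote

    html = re.sub(r"""\s+style=(["'])(.*?)\1""", clean_style, html,
                  flags=re.DOTALL | re.IGNORECASE)
    return html


sample = ('<p class="MsoNormal" style="mso-pagination:none; margin-left:1em">'
          'Hello<o:p></o:p></p><!--[if gte mso 9]><xml>junk</xml><![endif]-->')
print(strip_word_markup(sample))
# prints: <p style="margin-left:1em">Hello</p>
```

Regexes like these are fragile on real-world files (they would also touch matching text outside of tags), which is one reason to lean on Tidy for the heavy lifting.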