It's pretty clear to me that using Word's "Save as HTML" is NOT the answer for importing Word docs. Studying the output HTML files reveals the usual MS forest of unneeded tags and a lot of JavaScript, even for the simplest things. As was pointed out to me here, no ebook reader can read this.
My file has some spacing, drop caps, and underlines, all of which turned into literally pages of unwanted tags. Fine. I hand-stripped everything out, saved as HTML, and made sure there was no JavaScript in the resulting code. But I STILL got the empty white page of death in Sigil.
No problem, I tried saving as RTF. Nope, again the white page of death in Sigil. This was pretty frustrating, as I just couldn't see where the problem could possibly be hiding.
Fine, I saved as plain text. This worked, and Sigil imported the text without trouble, but I then had to go back in and fix the style issues in the HTML and make some on-screen edits. So far so good. At least I have a working document.
However, I think it is fair to make these observations:
Sigil should NOT recommend Word's internal HTML conversion as the "best way" to make an import file for Sigil. I tried many different, simple text snippets, and every one crashed when I attempted to load it into Sigil as HTML. Since RTF also didn't work for me, I think you should change the suggested Word export technique to plain text, as that would have saved me hours of work and many inexplicable problems. There are no doubt examples of Word files that can somehow work in a higher-level export format, but there are so many issues with even simple files that I just can't see it as the "recommended" way, especially since there is no guidance at all as to what can go wrong in Sigil or why.
Other than that time-wasting input format nightmare, I have to say Sigil worked pretty well, although two problems are still making me crazy:
1. How do I get paragraphs to indent automatically? The default is left-aligned blocks of text, which is not very attractive, and I see no way to fix it. I tried altering the P tag rule in the CSS area, but I could only get the inter-paragraph spaces to go away, not get a first-line indent.
2. Why on earth does the entire document reload and jump back to the very start whenever you change anything in the code window? Talk about irritating, especially in a 249-page document. There's no quick way to return to where you were.
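On the first point, this is roughly the rule I was expecting to work. It is only a sketch: I'm assuming the paragraphs in the imported file are plain P tags and that the reader honors the standard CSS text-indent property. The 1.5em value is an arbitrary choice of mine, not anything Sigil suggests.

```css
/* Sketch only: assumes body text is in plain <p> elements
   and that the reader honors these standard CSS properties. */
p {
  margin-top: 0;       /* remove the gap between paragraphs */
  margin-bottom: 0;
  text-indent: 1.5em;  /* indent the first line of each paragraph; 1.5em is arbitrary */
}
```

If something like this is supposed to work in Sigil and I just put it in the wrong place, I'd appreciate a pointer.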
I also noted that when saving, the program automatically appends .sgf to the file name, which makes saving as an EPUB file impossible. You have to go in and edit the file name to get rid of this quirk before saving as an EPUB file.
One last thing remains a mystery to me: does the TOC ever appear anywhere in the document? I have my entries in it, but within Sigil I can't see it or use it for navigation at all. How on earth do you make it actually appear? The wiki tutorial says zip on this topic.
Many thanks,
Walter