Hi, cap:
Well, I thought I'd figured out a way to get this to work, but I can't seem to get around the span problems. I've attached a sampling from the live file I'm working in right now, as a txt file.
As you'll see, some of the spans need to be deleted and replaced with a space; some don't. There does not seem to be any rhyme or reason as to what type of span appears where. I had some success with loading the file into OO Writer and then tweaking it in Sigil, so my process may end up being something horrid like: open in Word; search and replace the section breaks and soft hyphens; save; open in OO Writer, then save as HTML; THEN open in Sigil and remove all the bloody font declarations. No matter how you slice it, this is just a pain. Ignore the page headers that are still in there; I was going to experiment with using regex to simplify that cleanup as well.
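For what it's worth, the last step of that process (stripping soft hyphens, font declarations, and whatever bare spans are left behind) might be scripted along these lines. This is only a rough sketch, not a tested workflow: it assumes the font declarations show up as `font-family` entries inside double-quoted `style="..."` attributes, and the names `clean_html` and `_drop_font_family` are made up for illustration. It does not try to decide which spans should become a space, since that seems to need a human eye.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Assumption: style attributes are double-quoted, as most exporters write them.
STYLE_ATTR = re.compile(r'\sstyle="([^"]*)"', re.IGNORECASE)

# A span with no attributes whose content holds no other span tags.
BARE_SPAN = re.compile(
    r"<span\s*>((?:(?!</?span\b).)*?)</span>",
    re.IGNORECASE | re.DOTALL,
)


def _drop_font_family(match):
    """Rebuild a style attribute without its font-family declarations."""
    declarations = [d.strip() for d in match.group(1).split(";")]
    kept = [d for d in declarations
            if d and not d.lower().startswith("font-family")]
    if not kept:
        return ""  # nothing left, so drop the whole attribute
    return ' style="' + "; ".join(kept) + '"'


def clean_html(html):
    """Remove soft hyphens, font-family declarations, and bare spans."""
    html = html.replace(SOFT_HYPHEN, "")
    html = STYLE_ATTR.sub(_drop_font_family, html)
    # Unwrap attribute-free spans, innermost first, until nothing changes.
    previous = None
    while previous != html:
        previous = html
        html = BARE_SPAN.sub(r"\1", html)
    return html


sample = '<p><span style="font-family: Arial;">Hel\u00adlo</span> world</p>'
print(clean_html(sample))  # prints: <p>Hello world</p>
```

Spans that still carry something else (a class, a colour, italics) are left alone on purpose, so anything the script can't safely judge stays in the file for a manual pass in Sigil.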
Hitch