MobileRead Forums - View Single Post

LadyKate · 02-09-2014, 03:55 PM

Quote:

Originally Posted by unboggling

His and Her outhouses? Luxury! We had to share one.

Sounds similar to this:

http://daringfireball.net/projects/markdown/

(mentioned in Prefs > Input Options > TXT Input > Markdown)

I took a quick look at Markdown but... does it keep the italics and bold? I didn't notice that.

The freebie tool HTML Book Fixer (I couldn't find it available anywhere any more, but it still works well) strips the excess spans, BUT it also manages to remove the italics if they are in a span. Most irritating.

With excess nested spans it is darn near impossible to find the matching open/close tags for the italics using regex, and a royal pain to "eyeball" the italics in the original.
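For what it's worth, a real parser can keep track of which close tag belongs to which open tag, which is exactly what regex can't do. Here is a rough sketch using Python's standard html.parser. The names SpanFlattener and flatten_spans are made up for this example, and it assumes the italics and bold are set with inline font-style:italic / font-weight:bold styles; italics applied through a CSS class or a numeric font-weight would need extra handling.

```python
from html.parser import HTMLParser


class SpanFlattener(HTMLParser):
    """Drop <span> wrappers, but turn italic/bold spans into <i>/<b>.

    Assumption: emphasis is declared in an inline style attribute.
    """

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # closing markup owed by each open <span>

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            self.out.append(self.get_starttag_text())  # copy tag verbatim
            return
        style = (dict(attrs).get("style") or "").lower().replace(" ", "")
        opening, closing = "", ""
        if "font-style:italic" in style:
            opening, closing = opening + "<i>", "</i>" + closing
        if "font-weight:bold" in style:
            opening, closing = opening + "<b>", "</b>" + closing
        self.out.append(opening)
        self.span_stack.append(closing)  # remembered for the matching </span>

    def handle_startendtag(self, tag, attrs):
        if tag != "span":  # an empty <span/> carries nothing worth keeping
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "span":
            self.out.append(f"</{tag}>")
        elif self.span_stack:
            self.out.append(self.span_stack.pop())

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def flatten_spans(html):
    parser = SpanFlattener()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = (
    "<p class=MsoNormal><span style='mso-fareast-font-family:\"MS Mincho\"'>"
    'He said <span style="font-style: italic"><span lang="EN-US">no</span>'
    "</span> twice.</span></p>"
)
print(flatten_spans(sample))
```

The stack is the whole trick: every opening span pushes whatever closing markup it owes (possibly nothing), and every closing span pops the most recent entry, so the italics survive however deep the junk spans go.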

I don't know why modern word processors don't offer an option to clean up the underlying code they use to create PDF and HTML files. The main reason I find PDF files so hard to clean up is that most were created in a WYSIWYG program. Judging from the underlying code I get in the HTML, it is usually Word or a Word clone, which produces the horrid "<p class=MsoNormal><span style='mso-fareast-font-family:"MS Mincho"'>", often skipping the quotes around the class name. (Note: the font family/name is whatever font the doc used.)

I think all that excess code can lead to problems in conversions when it is nested too deep. I once had a problem because I hadn't cleaned up a file: I had not noticed that one of the nested div tags was class="chapter" and wrapped the entire chapter, while another was class="chapterHead" and wrapped only the "Chapter whatever" heading.
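One way to spot that kind of thing before converting would be to list every div with its class and nesting depth. A minimal sketch, again with Python's standard html.parser; DivOutline and outline_divs are names invented for the example, and it assumes the div tags are properly closed.

```python
from html.parser import HTMLParser


class DivOutline(HTMLParser):
    """Record each <div>'s class and how deeply it is nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.found = []  # (depth, class) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            self.found.append((self.depth, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def outline_divs(html):
    parser = DivOutline()
    parser.feed(html)
    parser.close()
    return parser.found


page = (
    '<div class="chapter"><div class="chapterHead">Chapter 1</div>'
    "<p>Text</p></div>"
)
for depth, name in outline_divs(page):
    print("  " * (depth - 1) + (name or "(no class)"))
```

A div that turns up at depth 1 and wraps everything else, like the "chapter" one here, is the sort of thing that is easy to miss by eye.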