Old 07-21-2014, 05:20 PM   #745
Rev. Bob
Wizard
Posts: 1,760
Karma: 9918418
Join Date: Feb 2013
Location: Here on the perimeter, there are no stars
Device: Kobo H2O, iPad mini 3, Kindle Touch
Quote:
Originally Posted by JSWolf
What I want is the proper output messages. Saying spans are being stripped does not work when no spans are being stripped. Saying the header is being modified when it is works much better.
Okay, fine. Since you can't seem to grasp the idea of waiting until the end of the debugging process to worry about the exact wording of the "routine X changed file Y" message, I'll move that all the way up to the top of the list.

In fact, I've broken the de-indent bit out into its own little new option, so it's not in stripspans at all any more. Stripspans still contracts and removes empty formatting elements, so <b/> and <i></i> go away, but you know what? "Stripped spans in X" seems to me like a fine way to cover that functionality in addition to the strict removal of <span> elements. If you don't like it, deal with it until we're sure all the bugs are worked out, then feel free to suggest an alternate wording if you like. As I've said before, let's worry about getting the engine running correctly before polishing the chrome.
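For anyone wondering what "removes empty formatting elements" amounts to, here is a rough sketch in Python. It is not the plugin's actual code: the function name, the regex, and the list of tags are all assumptions made up for illustration, and it only handles strictly empty elements (no whitespace inside).

```python
import re

# Hypothetical list of inline formatting tags; the plugin's real list may differ.
_INLINE = r"(?:b|i|em|strong|u)"

# Matches a self-closed element such as <b/>, or an empty pair such as <i></i>.
# The \b keeps <b/> from also matching <br/>.
EMPTY_INLINE = re.compile(
    r"<%s\b[^>]*/>|<(%s)\b[^>]*></\1>" % (_INLINE, _INLINE),
    re.IGNORECASE,
)

def strip_empty_formatting(html):
    """Remove empty inline formatting elements, repeating until nothing changes.

    Repeating handles nesting: <b><i></i></b> becomes <b></b>, then vanishes.
    """
    previous = None
    while previous != html:
        previous = html
        html = EMPTY_INLINE.sub("", html)
    return html
```

Looping until the text stops changing is just the simplest way to catch nested empties; a real parser would be more robust than a regex here.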

The new "de-indent" option is --unpretty (since it's the opposite of calibre's Beautify HTML in many ways), and it does the following:

- Removes tabs and other whitespace before HTML lines.
- Adds newlines after block-level elements.
- Adds a second newline after P, OL, UL, and H1-H6 elements.

In other words, it "straightens up" HTML, but without all the bloody indents. Note that this operation is not safe for files containing PRE elements, so the plugin skips any such file (and says so). The one feature it lacks that I'd like to add is "unwrap paragraphs," which would remove linefeeds within paragraphs, but that turns out to be a bit complicated for a quick afternoon fix.
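To make the three steps and the PRE check concrete, here is a minimal sketch in Python. It is an illustration only, not the code in the attached zip: the function name, the regexes, and especially the list of block-level tags are assumptions, and it ignores \r\n line endings.

```python
import re

# Assumed set of block-level tags; the plugin's actual list is not stated.
BLOCK_TAGS = r"p|div|ol|ul|li|h[1-6]|blockquote|table|tr"
# Tags that get a second newline, per the list above.
DOUBLE_TAGS = r"p|ol|ul|h[1-6]"

def unpretty(html):
    """Return de-indented HTML, or None if the file contains a PRE element."""
    if re.search(r"<pre\b", html, re.IGNORECASE):
        return None  # whitespace inside PRE is significant, so skip the file

    # 1. Remove tabs and other whitespace at the start of every line.
    html = re.sub(r"^[ \t]+", "", html, flags=re.MULTILINE)

    # 2. Exactly one newline after each block-level closing tag.
    html = re.sub(r"(</(?:%s)>)[ \t]*\n*" % BLOCK_TAGS, r"\1\n",
                  html, flags=re.IGNORECASE)

    # 3. A second newline after P, OL, UL, and H1-H6.
    html = re.sub(r"(</(?:%s)>)\n" % DOUBLE_TAGS, r"\1\n\n",
                  html, flags=re.IGNORECASE)
    return html
```

A caller would treat a None result as "skipped, file contains PRE" and report that instead of rewriting the file.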

EDIT, 7pm EDT: The unpretty algorithm has been updated. Nothing in stripspans or stripkobo has changed from the 5:20pm version.
Attached Files
File Type: zip Modify ePub - unpretty.zip (66.3 KB, 270 views)

Last edited by Rev. Bob; 07-21-2014 at 07:06 PM.