MobileRead Forums - View Single Post

slowsmile · 10-19-2017, 07:09 AM

Quote:

Filenames with spaces cause epubcheck to complain, and they are potentially troublesome.
So an option to change all filename spaces to underscores?
Quite tedious to do by hand with a few dozen files.

It's certainly possible to have spaces in filenames in the Book Browser. But curing that in plugin code would be quite a difficult task, because there are strict IDPF relationships and dependencies between the epub filenames, ids and hrefs (links) in the content.opf and toc.ncx files in the epub. I've already looked into doing this and realized fairly quickly that it was not easy because of all the dependencies and strict rules that apply to that data. So, IMHO, I'm afraid that your request is not really realistic and is probably not even possible at the plugin level.
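
To give a rough idea of the dependencies involved, here is a minimal Python sketch that only finds the manifest entries whose href contains a space. It is not code from the plugin: the function name `hrefs_with_spaces` and the sample OPF are made up for illustration, and it deliberately does not attempt the hard part, which is renaming the files and updating every matching link in the content.opf, the toc.ncx and the xhtml files.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Namespace used by elements in an OPF package document.
OPF_NS = "{http://www.idpf.org/2007/opf}"

def hrefs_with_spaces(opf_text):
    """Return (id, href) pairs for manifest items whose href contains a space.

    Hypothetical helper for illustration only. Hrefs are percent-decoded
    first, because a space is often stored as %20 in the OPF.
    """
    root = ET.fromstring(opf_text)
    found = []
    for item in root.iter(OPF_NS + "item"):
        href = unquote(item.get("href", ""))
        if " " in href:
            found.append((item.get("id"), href))
    return found

# Invented sample data, not taken from a real epub.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch1" href="Text/chapter%201.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="Text/chapter_2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
</package>"""

print(hrefs_with_spaces(SAMPLE_OPF))
```

Every href this reports would also have to be changed wherever it is referenced elsewhere in the epub, which is where the real work lies.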

Quote:

"Mend XHTML Source Code" -- can this be made to convert Windows-1252/ANSI html files to UTF-8?

Sorry, but I'm going to have to say no again. The CustomCleanerPlus plugin is limited to doing just two specific tasks. Its first task is to clean out the dross proprietary data from the html or epub. Its second task is to ensure epub 2 code compliance. In other words, the plugin is a cleaner -- it's not a fixer or a converter. Also, converting several different encodings within your file to unicode is not such an easy task. These problems are usually caused by the user editing or formatting his or her doc or code in too many applications without ensuring that all of these editing environments are set to unicode. Unless you are careful, you will have cp1252, latin-1, ANSI and over 100 other possible encodings pop up in your epub. You can try patching your file in code using just the favourites, like all the UTF forms, latin-1, cp1252 etc., but you can never really be sure that you have covered all bases. I could be wrong, but I think Sigil also does a basic unicode check whenever you open an epub or import an html file into Sigil.
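
For anyone who wants to try patching a file in code, here is a minimal Python sketch of the "try the favourites" approach. It is only an illustration under my own assumptions, not something the plugin does: the function name `decode_with_fallbacks` and the order of the candidate encodings are made up. Note that latin-1 accepts any byte sequence, so it has to come last, and a "successful" decode only means the bytes were valid in that encoding, not that the guess was right.

```python
def decode_with_fallbacks(raw, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Decode bytes with the first candidate encoding that does not fail.

    Hypothetical helper for illustration. Returns (text, encoding_name).
    "utf-8-sig" reads UTF-8 with or without a byte order mark.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeError:
            continue
    raise ValueError("none of the candidate encodings matched")

def to_utf8(raw):
    """Re-encode the decoded text as UTF-8 bytes."""
    text, _ = decode_with_fallbacks(raw)
    return text.encode("utf-8")
```

A file that mixes several encodings cannot be repaired this way, because the whole byte string is decoded with a single guess.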

Quote:

-- Trims the epub stylesheet(s) - removes any unneeded or redundant class properties from the css
I like to snoop through interesting new stylesheets, and I can then use the Sigil tool to strip them if I want to.

By the word "redundant" used above I mean unnecessary style property declarations in the css. Here's an example:

.my-style {
font-size: 100%;
line-height: 1.2em;
}

The "font-size: 100%" and the "line-height: 1.2em" are unnecessary because you should always go with the vendor defaults for these properties for standard epubs (and for Kindle mobi). Removing them may or may not result in an empty style declaration. But if you don't remove these properties from your epub, they might also cause problems on some vendor conversions.
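
As a rough illustration of that kind of trimming, here is a minimal Python sketch. It is not the plugin's actual code: the function name `strip_redundant` and the `REDUNDANT` set are assumptions based only on the two declarations in my example, and the regex only copes with simple flat rules (no @media blocks, comments or semicolons inside values).

```python
import re

# Assumed list of (property, value) pairs to drop, taken from the example above.
REDUNDANT = {("font-size", "100%"), ("line-height", "1.2em")}

def strip_redundant(css):
    """Remove declarations listed in REDUNDANT from simple, flat CSS rules."""
    def clean_body(match):
        kept = []
        for decl in match.group(1).split(";"):
            if ":" not in decl:
                continue  # skip the whitespace left after the last semicolon
            prop, value = (part.strip().lower() for part in decl.split(":", 1))
            if (prop, value) not in REDUNDANT:
                kept.append("  " + decl.strip() + ";")
        if not kept:
            return "{\n}"  # the rule is left as an empty style declaration
        return "{\n" + "\n".join(kept) + "\n}"
    return re.sub(r"\{([^{}]*)\}", clean_body, css)

sample = (
    ".my-style {\n  font-size: 100%;\n  line-height: 1.2em;\n}\n"
    ".keep {\n  color: red;\n  font-size: 100%;\n}\n"
)
print(strip_redundant(sample))
```

In this sample the first rule ends up empty, while the second keeps only its colour declaration.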

Quote:

-- Removes all html tags that are empty or that contain just spaces
These are often in a file because a linespace was intended. So after spotting these I often use them to modify the html, e.g. convert the following tag to one with "margin-top: 2em", for example.

Sorry again, but I can't accommodate your wish with this one either, for two reasons. First, I've found that the best way to format an ebook as a Word doc or ODT doc for conversion to epub is to ensure that you have created named paragraph styles for all the headings, text and spacing in your doc. The second reason is that blank lines or hard line breaks made with the Enter key were causing havoc with my plugin's image formatting during testing. Epub xhtml breaks also behave completely differently from html breaks, and this was a big problem for me initially with this plugin. That's really why I decided to just remove all hard breaks from the epub on cleanup.
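
For the curious, removing tags that are empty or contain just spaces can be sketched in a few lines of Python. This is only an illustration and not the plugin's real code: the function name `remove_empty_tags` and the list of tag names are my own assumptions, and a regex like this ignores self-closing forms and non-breaking space entities.

```python
import re

# Assumed set of tags to check; \1 makes the closing tag match the opening one.
EMPTY_TAG = re.compile(r"<(p|span|div|b|i|em|strong)(\s[^<>]*)?>\s*</\1>")

def remove_empty_tags(xhtml):
    """Delete listed tags whose content is empty or whitespace only.

    Repeats until nothing changes, so a tag that becomes empty after its
    empty child is removed gets deleted on a later pass.
    """
    previous = None
    while previous != xhtml:
        previous = xhtml
        xhtml = EMPTY_TAG.sub("", xhtml)
    return xhtml

print(remove_empty_tags('<p>Text</p><p> </p><p><span class="x"></span></p>'))
```

If you want to keep such tags as spacing markers, you would have to search for them yourself before running any cleanup of this kind.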