Hi all,
as requested a few weeks ago, here is a plugin that removes from the xhtml files all classes and ids that are not referenced anywhere in the stylesheets (nor, in the case of ids, by fragment identifiers in href or other attributes).
There is a graphical interface that lets the user choose what they want to remove and what they want to keep.
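To make "not referenced anywhere in the stylesheets" a bit more concrete, here is a rough sketch of the idea. It is not the plugin's actual code: it uses only the Python standard library and a naive regex in place of a real CSS parser, so it ignores nested blocks such as @media rules. The names `find_unreferenced` and `_AttrCollector` are made up for this example.

```python
import re
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Collect every class name and id value used in a document."""

    def __init__(self):
        super().__init__()
        self.classes = set()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "class":
                # A class attribute holds a whitespace-separated list of names.
                self.classes.update(value.split())
            elif name == "id":
                self.ids.add(value)


def find_unreferenced(xhtml, css):
    """Return (classes, ids) used in the markup but absent from the CSS selectors."""
    collector = _AttrCollector()
    collector.feed(xhtml)
    # Drop declaration blocks so that colours like #fff are not mistaken for ids.
    # Assumption: no nested blocks (a real CSS parser would handle those).
    selectors = re.sub(r"\{[^}]*\}", " ", css)
    css_classes = set(re.findall(r"\.(-?[A-Za-z_][\w-]*)", selectors))
    css_ids = set(re.findall(r"#(-?[A-Za-z_][\w-]*)", selectors))
    return collector.classes - css_classes, collector.ids - css_ids
```

For example, given a paragraph with class="a b" and a stylesheet that only mentions .a, the function would report b as a candidate for removal. Ids reported this way would still need to be checked against fragment identifiers and id references before being deleted.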
I tried to adjust the background and text colors to keep it all readable and not too ugly in the various dark modes, but I couldn't test the plugin on a Mac (and I know Tcl/Tk and Mac don't always get along very well...).
The css parser used by the plugin is css_parser/cssutils and the xhtml parser is gumbo (through the sigil_bs4 adapter); both are provided by the Sigil installers.
The license of the plugin is the GPL v3 or any later version.
I tested this plugin on Windows 11 with Sigil 2.3 and 2.4.2, but it should work the same on Linux and macOS.
If the plugin's window doesn't appear after launch on macOS, it is probably just hidden behind other windows: you should be able to bring it to the foreground by clicking on the plugin's icon in the dock.
Up to version 0.2.2, the plugin by default looked for id references only in the form of fragment identifiers in the attributes "href", "epub:textref" and "src". Since version 0.2.3, this list has been expanded into three groups of attributes (users can edit these lists in the Preferences pane of the plugin, after launching it).
Attributes that can contain a fragment identifier:
- href
- epub:textref
- src
- action
- cite
- data
- form
- formaction
- ping
- poster
- xlink:href
- altimg
- cdgroup
- resource
Attributes that can contain a single id reference:
- commandfor
- list
- popovertarget
- xref
- aria-activedescendant
Attributes that can contain a whitespace-separated list of id references:
- for
- headers
- itemref
- aria-controls
- aria-describedby
- aria-details
- aria-errormessage
- aria-flowto
- aria-labelledby
- aria-owns
(A fragment identifier is the part of a URI that follows the '#' character, e.g. href="#fragid".
An id reference is just the value of the id that is referenced.)
Some of the attributes will probably never be used in epubs to match an element's id (like the "ping" attribute), but it should be harmless to keep them around, just in case.
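As an illustration of how the three groups differ, here is a minimal sketch that collects the ids referenced by each kind of attribute. It is only an approximation written with the standard library, not the plugin's own code; the names `referenced_ids` and `_IdRefCollector` are hypothetical, and the sketch does not check which file a fragment identifier points to.

```python
from html.parser import HTMLParser

# The three default groups listed above.
FRAGMENT_ATTRS = {
    "href", "epub:textref", "src", "action", "cite", "data", "form",
    "formaction", "ping", "poster", "xlink:href", "altimg", "cdgroup",
    "resource",
}
IDREF_ATTRS = {"commandfor", "list", "popovertarget", "xref",
               "aria-activedescendant"}
IDREFS_ATTRS = {
    "for", "headers", "itemref", "aria-controls", "aria-describedby",
    "aria-details", "aria-errormessage", "aria-flowto", "aria-labelledby",
    "aria-owns",
}


class _IdRefCollector(HTMLParser):
    """Gather every id that some attribute in the markup points to."""

    def __init__(self):
        super().__init__()
        self.referenced = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if name in FRAGMENT_ATTRS:
                # Fragment identifier: keep what follows the first '#', if any.
                _, sep, fragment = value.partition("#")
                if sep and fragment:
                    self.referenced.add(fragment)
            elif name in IDREF_ATTRS:
                # Single id reference: the value itself is the id.
                self.referenced.add(value.strip())
            elif name in IDREFS_ATTRS:
                # Whitespace-separated list of id references.
                self.referenced.update(value.split())


def referenced_ids(markup):
    """Return the set of ids referenced anywhere in the given markup."""
    collector = _IdRefCollector()
    collector.feed(markup)
    return collector.referenced
```

Any id that appears in the returned set should be kept, even if no stylesheet mentions it.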
Version 0.2.3 of the plugin also provides a different tk theme for the Linux ui, which I find more pleasant than the default one. If it causes problems, you can disable it by manually editing the preferences of the plugin in $SIGIL_PREFS/plugin_prefs/cssUndefinedClasses/cssUndefinedClasses.json and setting (or adding) the "tktheme" entry to an empty string (i.e. "tktheme": "").
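If you would rather not edit the file by hand, something like the following should do the same thing. This is only a sketch: it assumes the preferences file is a plain JSON object with "tktheme" as a top-level key, and the function name `disable_tk_theme` is made up for this example. You have to pass the real path to the json file yourself.

```python
import json
from pathlib import Path


def disable_tk_theme(prefs_file):
    """Set the "tktheme" entry to an empty string, adding it if it is missing."""
    path = Path(prefs_file)
    # Assumption: the file contains a single JSON object.
    prefs = json.loads(path.read_text(encoding="utf-8"))
    prefs["tktheme"] = ""
    path.write_text(json.dumps(prefs, indent=2), encoding="utf-8")
    return prefs
```

All other entries in the file are left unchanged; it is probably safest to run this while the plugin is not open.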
Changes:
v0.2.0:
- the search for fragment identifiers has been extended to all xml files (ncx, media overlays...);
- the list of attributes in which to search for fragment identifiers is user-customizable in the preferences pane; the default values are href, epub:textref and src;
- another new option in the preferences pane lets the user restrict the search for classes and ids to be removed to a selected subset of files.
v0.2.1:
- quick bug fix (in v0.2.0 fragment identifiers were searched for only in the selected xhtml files).
v0.2.2:
- added an icon for the plugin;
- less rigid parsing of css selectors: the plugin now also processes some invalid class and id names found in the stylesheets (although they are invalid, the plugin shouldn't ignore them).
v0.2.3:
- expanded the list of attributes used to search for id references;
- added the clearlooks tk theme for the Linux ui.