06-01-2012, 04:08 PM | #1 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Cleaning a stylesheet of unused styles
Hi
I am not sure this belongs in the Sigil forum. Please forgive me if I am mistaken. Ideally we all have clean style sheets. Nearly always. Sometimes, though, when a badly formatted file has gone through a converter that is too trusting and not discriminating enough, some (many) unused styles land in the style sheet and clutter it. As a preventive step, just before publishing, I'd like to be sure that my style sheet contains only styles that are really used in the html files, and to be able to discard the others. I can imagine for example a script or a tool that parses the style sheet, then counts the occurrences of each style in the html files. After that, a style showing 0 occurrences could probably be safely deleted. Maybe there could be other solutions? |
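The script imagined in this post could look roughly like the following. It is only a minimal sketch, not an existing tool: the names `css_classes` and `count_class_usage` are made up for illustration, it looks at class selectors only, and it relies on regular expressions rather than a real CSS or HTML parser, so nested rules such as @media blocks and single-quoted attributes are not handled.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration; not part of Sigil or calibre.
CLASS_SELECTOR = re.compile(r"\.(-?[_a-zA-Z][\w-]*)")
CLASS_ATTR = re.compile(r'class\s*=\s*"([^"]*)"')  # double-quoted attributes only


def css_classes(css_text):
    """Return the set of class names that appear in selectors."""
    # Drop comments and declaration blocks so values like "1.5em" are ignored.
    no_comments = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)
    selectors_only = re.sub(r"\{[^}]*\}", " ", no_comments)
    return set(CLASS_SELECTOR.findall(selectors_only))


def count_class_usage(css_text, html_texts):
    """Count how often each class from the stylesheet is used in the HTML."""
    used = Counter()
    for html in html_texts:
        for value in CLASS_ATTR.findall(html):
            used.update(value.split())
    return {name: used[name] for name in sorted(css_classes(css_text))}


css = "p.text { margin: 0 } .caption { font-size: 0.8em } .calibre7 { color: red }"
pages = ['<p class="text">One</p><p class="text caption">Two</p>']
for name, count in count_class_usage(css, pages).items():
    print(name, count)
```

A class reported with a count of 0 is only a candidate for deletion; the rule should still be checked by eye before it is removed.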
06-01-2012, 04:24 PM | #2 | |
Well trained by Cats
Posts: 29,965
Karma: 55705602
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Feature has been requested as part of the Calibre Quality Check PI. Flightcrew only concerns itself with invalid EPUB syntax. Excess (unused) selectors are not against the rules. (In reality, Flightcrew does not complain if a selector does not exist, and this should be, at minimum, a Warning event.) |
|
06-01-2012, 04:39 PM | #3 |
Grand Sorcerer
Posts: 27,602
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I just search for occurrences of each class name in the xhtml and delete it if the search comes up empty. Yeah, it can be a bit time consuming, but in general, cleaning up the code in a book is almost always time consuming anyway, but at least you only have to do it once (for each book).
Code:
\bsuspected-unused-classname\b
A plugin API will be so welcome when it happens. (that's just wishful thinking not a nudge or anything ) Last edited by DiapDealer; 06-01-2012 at 04:43 PM. |
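For anyone who wants to script that search, here is a small sketch in Python. The helper name `class_is_mentioned` is hypothetical. One thing to watch: `\b` treats a hyphen as a word boundary, so a plain `\bnote\b` search also hits `note-small`; the sketch uses lookarounds to avoid that.

```python
import re


def class_is_mentioned(name, xhtml_text):
    """True if `name` occurs as a whole word in the text (hypothetical helper)."""
    # Plain \bname\b would also match inside "name-small", because "-" is not
    # a word character; the lookarounds reject a neighbouring hyphen.
    pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
    return re.search(pattern, xhtml_text) is not None


page = '<p class="note-small">Hello</p><p class="text">World</p>'
print(class_is_mentioned("text", page))          # whole-word hit
print(class_is_mentioned("note", page))          # only "note-small" exists
print(re.search(r"\bnote\b", page) is not None)  # plain \b still matches
```

Like the manual search, this looks at the whole file, so a class name that is also an ordinary word in the book's text will show up as a hit.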
06-01-2012, 05:38 PM | #4 |
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
|
There is a long-standing issue requesting this in Sigil - to identify/remove unused CSS, etc. Haven't looked at it yet, but it's definitely something I'd like to see as well, if only to clean out styles I've added but then decided not to use.
Ideally I'd find something to base it on, as building from scratch is a long road. |
06-01-2012, 05:44 PM | #5 |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Doing an epub to epub conversion in Calibre will clean the style sheet of unused stuff and put it in alphabetical order to boot. I have been surprised, more than a few times, that Calibre will catch some of my typos and create new labels for me that I can then clean up.
Regards - John |
06-01-2012, 05:54 PM | #6 |
Grand Sorcerer
Posts: 27,602
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I love calibre, but if I've just spent a long time getting the ePub's code into the shape I wanted, an epub to epub conversion with calibre is going to literally destroy a big chunk of my work.
Just a side-note... I'd be interested in the reverse of this idea as well: finding orphaned classes assigned to elements in the xhtml that aren't represented in the stylesheet. Last edited by DiapDealer; 06-01-2012 at 05:59 PM. |
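The reverse check described here can be sketched as a set difference: classes used in the xhtml minus classes mentioned in the stylesheet. This is an illustration under assumptions, not a Sigil feature; `ClassCollector` and `orphaned_classes` are hypothetical names, and the CSS side is a simple regex that ignores comments and nested at-rules.

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used in class="..." attributes."""

    def __init__(self):
        super().__init__()
        self.used = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key == "class" and value:
                self.used.update(value.split())


def orphaned_classes(css_text, xhtml_texts):
    """Classes assigned in the xhtml that no selector in the CSS mentions."""
    collector = ClassCollector()
    for text in xhtml_texts:
        collector.feed(text)
    selectors = re.sub(r"\{[^}]*\}", " ", css_text)  # keep selectors only
    defined = set(re.findall(r"\.(-?[_a-zA-Z][\w-]*)", selectors))
    return sorted(collector.used - defined)


css = "p.text { text-indent: 1em } .poetry { margin-left: 2em }"
page = '<div class="poetry"><p class="txet">Roses</p><p class="text">Violets</p></div>'
print(orphaned_classes(css, [page]))
```

A misspelled class in the markup has no matching rule, so it ends up in the returned list.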
06-01-2012, 06:54 PM | #7 | |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Quote:
Actually it kinda does do the reverse. The typos I was talking about are misspelled labels (classes?) that will cause Calibre to create a label for me. Since Calibre always uses "calibreXX" as labels and the output is in alphabetical order, they are easy to spot and fix (all of my labels are descriptive). When I go to the label created by Calibre I normally find that it has replaced something like class="txet", which should have been "text". Regards - John |
|
06-01-2012, 07:30 PM | #8 | |
Grand Sorcerer
Posts: 27,602
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Last edited by DiapDealer; 06-01-2012 at 07:33 PM. |
|
06-02-2012, 12:17 AM | #9 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
well you could use a hybrid approach.
1. save original epub or set that option in calibre for epub to epub convert.
2. run the calibre convert
3. open the result & make a note of what styles were deleted
4. delete that epub file, restore original, manually delete styles noted in step 3.
could be faster than lots of find / count operations?
I'd love to see a sigil enhancement or plug in though - that would persuade me to move up from version 4 Last edited by cybmole; 06-02-2012 at 12:20 AM. |
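Step 3 of the hybrid approach, noting which styles were deleted, could be done by comparing the two stylesheets instead of reading them side by side. The sketch below assumes the conversion keeps the original class names for the rules it retains, which this thread does not confirm; `class_names` and `dropped_classes` are hypothetical helpers.

```python
import re


def class_names(css_text):
    """Class names found in the selectors of a stylesheet (regex sketch)."""
    selectors = re.sub(r"\{[^}]*\}", " ", css_text)
    return set(re.findall(r"\.(-?[_a-zA-Z][\w-]*)", selectors))


def dropped_classes(original_css, converted_css):
    """Classes present before the conversion but missing afterwards."""
    return sorted(class_names(original_css) - class_names(converted_css))


# Made-up stylesheets standing in for the original and the converted copy.
before = ".text { margin: 0 } .caption { font-size: small } .unused1 { color: red }"
after = ".caption { font-size: small } .text { margin: 0 }"
print(dropped_classes(before, after))
```

The resulting names are the ones to look up and delete by hand in the restored original.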
06-02-2012, 01:52 AM | #10 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
So, to sum it up for now, we can first try using calibre's epub to epub conversion as a sniffer dog on a copy of our EPUB. This conversion should provide us with some modified selector names (the ones calibre needed to modify).
Then, as DiapDealer said, we can come back to the original EPUB and ask these modified selectors the \bsuspect\b question. Thanks for these tips. I'll keep the question open because, of course, there could be other proposals. |
06-02-2012, 02:46 AM | #11 |
frumious Bandersnatch
Posts: 7,516
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
A tool that can detect unused/undefined styles would be very useful indeed. Just note that unused styles could be more than just classes. A stylesheet could contain something like this:
Code:
div p.caption { ... }
A possible hackish way to look for unused styles: Set everything to display:none, except for one selector. Open the files in a browser. If you see anything, that selector is being used there. Repeat with a different selector, etc.
Finding all undefined styles is harder/impossible. How could an automated tool know that, for this:
Code:
<div class="poetry">
  <p>...</p>
  <p>...</p>
</div> |
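To illustrate why a selector like `div p.caption` needs more than a class-name search, here is a rough sketch that tests a descendant selector against well-formed XHTML using Python's standard XML parser. It is an assumption-laden toy, not a CSS engine: the function names are made up, and it only understands space-separated steps of the form `tag`, `.class` or `tag.class`.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as {http://www.w3.org/1999/xhtml}."""
    return tag.rsplit("}", 1)[-1]


def matches_compound(element, compound):
    """Match one 'tag', '.class' or 'tag.class' step against an element."""
    tag, _, cls = compound.partition(".")
    if tag and local_name(element.tag) != tag:
        return False
    if cls and cls not in element.get("class", "").split():
        return False
    return True


def selector_is_used(selector, xhtml_text):
    """True if a descendant selector like 'div p.caption' matches anything."""
    root = ET.fromstring(xhtml_text)
    parents = {child: parent for parent in root.iter() for child in parent}
    *ancestors, last = selector.split()
    for element in root.iter():
        if not matches_compound(element, last):
            continue
        remaining = list(ancestors)
        node = parents.get(element)
        # Walk up the tree, consuming ancestor steps from right to left.
        while node is not None and remaining:
            if matches_compound(node, remaining[-1]):
                remaining.pop()
            node = parents.get(node)
        if not remaining:
            return True
    return False


page = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<div><p class="caption">Figure 1</p></div>
<p class="caption">Loose caption</p>
</body></html>"""
print(selector_is_used("div p.caption", page))
print(selector_is_used("blockquote p.caption", page))
print(selector_is_used("p.caption", page))
```

The class `caption` is present in every case, yet only the selectors whose ancestor chain actually occurs in the document count as used.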
06-02-2012, 08:04 AM | #12 |
Grand Sorcerer
Posts: 27,602
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
True, that would be next to impossible to programmatically determine... but I wasn't really thinking of that level of control. I was only thinking of truly empty/orphaned class names—meaning that given your html example, ".poetry" simply doesn't exist anywhere in the CSS.
|
06-02-2012, 10:22 AM | #13 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
I am a little off topic here, but this may interest some people.
As a good part of my work is done upstream in OpenOffice, I asked, on a French Linux forum, if there was any tool for this kind of job. Well, yes. This is a macro that detects unused custom styles in an odt file and offers to delete them. I tried it and it works (at least in the French-language version) for this purpose, first for character styles, then for paragraph styles. You have to insert it in My Macros/Standard. Spoiler:
It really works. End of off-topic. Last edited by roger64; 06-07-2012 at 03:49 AM. Reason: spoiler |
06-02-2012, 02:04 PM | #14 |
Grand Sorcerer
Posts: 12,256
Karma: 74007256
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
You might like to look at "Degristling the sausage", which has some discussion of locating unused CSS classes.
Unfortunately, it uses a Mac editor, BBEdit. However, in one of the comments the responder points to a python3 script that will list all used CSS styles.
Code:
Usage: from the CLI, type ./cssstylelist.py > dump.txt
Spoiler:
|
06-02-2012, 02:55 PM | #15 |
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
|
Thanks.
Getting the list of used styles should be fairly straightforward if we're just looking for classes that are used. The internal XML parsers that are used can list them like this script. How to list them is another matter. But parsing the CSS files to find the unused items is entirely different. There are a lot more ways to define items in the CSS than I've used. An interesting challenge to look into for a later version. |
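As a rough picture of the "lot more ways to define items in the CSS", the sketch below splits a stylesheet into individual selectors and labels what each one tests for. It is a regex-based illustration with a hypothetical function name, `selector_kinds`; it does not understand at-rules such as @media or @font-face and is not how Sigil parses CSS.

```python
import re


def selector_kinds(css_text):
    """Map each selector in the stylesheet to the kinds of tests it contains."""
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)  # drop comments
    kinds = {}
    for selector_group in re.findall(r"([^{}]+)\{[^}]*\}", css_text):
        for selector in selector_group.split(","):
            selector = " ".join(selector.split())  # normalise whitespace
            if not selector:
                continue
            found = []
            if re.search(r"\.[\w-]+", selector):
                found.append("class")
            if re.search(r"#[\w-]+", selector):
                found.append("id")
            if "[" in selector:
                found.append("attribute")
            if ":" in selector:
                found.append("pseudo")
            if re.search(r"(^|[\s>+~])[a-zA-Z][\w-]*", selector):
                found.append("element")
            kinds[selector] = found
    return kinds


css = (
    "h1, h2 { font-weight: bold } "
    "p.text:first-letter { font-size: 2em } "
    "#toc a[href] { color: blue } "
    ".caption { font-style: italic }"
)
for selector, found in selector_kinds(css).items():
    print(selector, found)
```

Only selectors labelled with nothing but "class" can be checked by a simple class-name search; everything else needs matching against the document structure.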
|