Old 06-01-2012, 05:08 PM   #1
roger64
Wizard
Cleaning a stylesheet of unused styles

Hi

I am not sure this belongs in the Sigil forum. Please forgive me if I am mistaken.

Ideally we all have clean style sheets. Nearly always. Sometimes, though, starting from a badly formatted file and using a converter that is too trusting and not discriminating enough, some (many) unused styles land in the style sheet and clutter it.

As a preventive measure, just before publishing, I'd like to be sure that my style sheet contains only styles that are actually used in the HTML files, and to be able to discard the others.

I can imagine, for example, a script or a tool that parses the style sheet, then counts the occurrences of each style in the HTML files. After that, a style showing 0 occurrences could probably be safely deleted.
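A minimal sketch of such a counting script in Python, assuming only class selectors (`.name`) matter and that the XHTML files are well-formed. The function names `css_classes` and `count_class_usage` are made up for illustration, and the regular expression is a rough approximation of CSS syntax, not a real CSS parser.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Rough approximation: a dot followed by a CSS identifier.
# This is NOT a real CSS parser; unusual syntax may confuse it.
CLASS_SELECTOR = re.compile(r"\.(-?[_a-zA-Z][\w-]*)")

def css_classes(css_text):
    """Return the set of class names that appear in selectors."""
    # Strip comments first.
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)
    # Drop declaration blocks so values such as "1.5em" are not scanned.
    selectors_only = re.sub(r"\{[^{}]*\}", " ", css_text)
    return set(CLASS_SELECTOR.findall(selectors_only))

def count_class_usage(css_text, xhtml_texts):
    """Count how many elements use each class named in the stylesheet."""
    counts = Counter({name: 0 for name in css_classes(css_text)})
    for text in xhtml_texts:
        for element in ET.fromstring(text).iter():
            # A class attribute may hold several space-separated names.
            for name in (element.get("class") or "").split():
                if name in counts:
                    counts[name] += 1
    return counts

css = "p.caption { font-size: 0.9em } .unused { color: red } h1 { margin: 1.5em }"
page = '<html><body><p class="caption big">x</p><p class="caption">y</p></body></html>'
usage = count_class_usage(css, [page])
for name in sorted(usage):
    print(name, usage[name])
```

A class reported with a count of 0 is only a candidate for deletion. This sketch looks at class names alone and ignores the rest of each selector.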

Maybe there could be other solutions?
Old 06-01-2012, 05:24 PM   #2
theducks
Grand Sorcerer
IMHO this is an EPUB forum question, but I will not move the thread unless requested, because of possible overlap with Flightcrew checks.

This feature has been requested as part of the Calibre Quality Check plugin.

Flightcrew only concerns itself with invalid EPUB syntax. Excess (unused) selectors are not against the rules. (In reality, Flightcrew does not complain if a selector does not exist, and this should be, at minimum, a Warning event.)
Old 06-01-2012, 05:39 PM   #3
DiapDealer
Grand Sorcerer
I just search for occurrences of each class name in the xhtml and delete it if the search comes up empty. Yeah, it can be a bit time-consuming, but cleaning up the code in a book is almost always time-consuming anyway, and at least you only have to do it once (for each book).

\bsuspected-unused-classname\b
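As a rough illustration of that search, here is a small Python sketch that applies the same word-boundary pattern to a block of XHTML text. The helper name `find_class_mentions` and the sample markup are assumptions made up for this example.

```python
import re

def find_class_mentions(class_name, xhtml_text):
    """Return the lines of xhtml_text that match \\bclass_name\\b."""
    pattern = re.compile(r"\b" + re.escape(class_name) + r"\b")
    return [line for line in xhtml_text.splitlines() if pattern.search(line)]

sample = '''<p class="note">First</p>
<p class="footnote">Second</p>
<p class="side-note">Third</p>
<p>Please note this sentence.</p>'''

for line in find_class_mentions("note", sample):
    print(line)
```

Note that `\b` treats a hyphen as a word boundary, so searching for `note` also hits `side-note`, and it hits the word in ordinary body text too. The search can therefore report a class as used when it is not, but an empty result still means the name appears nowhere.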

A plugin API will be so welcome when it happens.
(That's just wishful thinking, not a nudge or anything.)

Last edited by DiapDealer; 06-01-2012 at 05:43 PM.
Old 06-01-2012, 06:38 PM   #4
meme
Sigil developer
There is a long-standing issue requesting this in Sigil: to identify/remove unused CSS, etc. I haven't looked at it yet, but it's definitely something I'd like to see as well, if only to clean out styles I've added but then decided not to use.

Ideally we would find something to base it on, as building from scratch is a long road.
Old 06-01-2012, 06:44 PM   #5
Jabby
Jr. - Junior Member
Doing an epub to epub conversion in Calibre will clean the style sheet of unused stuff and put it in alphabetical order to boot. I have been surprised, more than a few times, that Calibre will catch some of my typos and create new labels for me that I can then clean up.

Regards - John
Old 06-01-2012, 06:54 PM   #6
DiapDealer
Grand Sorcerer
I love calibre, but if I have just spent a long time getting the ePub's code into the shape I wanted, an epub to epub conversion with calibre is going to literally destroy a big chunk of my work.

Just a side-note... I'd be interested in the reverse of this idea as well: finding orphaned classes assigned to elements in the xhtml that aren't represented in the stylesheet.

Last edited by DiapDealer; 06-01-2012 at 06:59 PM.
Old 06-01-2012, 07:54 PM   #7
Jabby
Jr. - Junior Member
I guess I'm just lucky. Calibre has never screwed up any of my epubs, although I keep my original just in case. Of course I don't do anything more complicated than a table (which is pretty complicated for me). No linked footnotes or such.

Actually it kinda does do the reverse. The typos I was talking about are misspelled labels (classes?) that cause Calibre to create a label for me. Since Calibre always uses "calibreXX" as labels and the output is in alphabetical order, they are easy to spot and fix (all of my labels are descriptive). When I go to the label created by Calibre, I normally find that it has replaced something like class="txet", where I meant "text".

Regards - John
Old 06-01-2012, 08:30 PM   #8
DiapDealer
Grand Sorcerer
Quote:
I guess I'm just lucky. Calibre has never screwed up any of my epubs, although I keep my original just in case. Of course I don't do anything more complicated than a table
Calibre assigns every single HTML element a class. That's the part I can't tolerate (but I understand why they've chosen to do it that way). I want bare-naked tags, except for the handful of paragraphs/spans/etc. that need special styling. But I fully admit I'm just extra finicky when it comes to that sort of coding convention. I'm a rabid minimalist.

Last edited by DiapDealer; 06-01-2012 at 08:33 PM.
Old 06-02-2012, 01:17 AM   #9
cybmole
Wizard
Well, you could use a hybrid approach.
1. Save the original epub, or set that option in calibre for the epub to epub conversion.
2. Run the calibre conversion.
3. Open the result and make a note of which styles were deleted.
4. Delete that epub file, restore the original, and manually delete the styles noted in step 3.

Could be faster than lots of find/count operations?

I'd love to see a Sigil enhancement or plugin though; that would persuade me to move up from version 4.

Last edited by cybmole; 06-02-2012 at 01:20 AM.
Old 06-02-2012, 02:52 AM   #10
roger64
Wizard
Quote:
Originally Posted by cybmole
well you could use a hybrid approach.
So, to sum it up for now, we can first try calibre's epub to epub conversion as a sniffer dog on a copy of our EPUB. This conversion should provide us with some modified selector names (the ones calibre needed to modify).

Then, as DiapDealer said, we can come back to the original EPUB and ask these modified selectors the \bsuspect\b question.

Thanks for these tips.

I'll leave the thread open because, of course, there could be other proposals.
Old 06-02-2012, 03:46 AM   #11
Jellby
frumious Bandersnatch
A tool that can detect unused/undefined styles would be very useful indeed. Just note that unused styles could be more than just classes. A stylesheet could contain something like this:

Code:
div p.caption { ... }
and if the XHTML does contain <p class="caption">, but not inside a <div>, the above style would be unused. So a CSS checking tool should be a little more sophisticated.
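To make the point concrete, here is a small Python sketch that checks whether the descendant selector `div p.caption` would match anything, rather than just whether the class name occurs. The function name `matches_div_p_caption` is hypothetical, and the sample markup deliberately omits the XHTML namespace; real EPUB files declare one, so the tag names would need to be namespace-qualified there.

```python
import xml.etree.ElementTree as ET

def matches_div_p_caption(xhtml_text):
    """Return True if some <p class="caption"> has a <div> ancestor."""
    root = ET.fromstring(xhtml_text)
    for div in root.iter("div"):
        # div.iter("p") walks every <p> nested at any depth inside this div.
        for p in div.iter("p"):
            if "caption" in (p.get("class") or "").split():
                return True
    return False

inside = '<body><div><p class="caption">A</p></div></body>'
outside = '<body><p class="caption">B</p><div><p>C</p></div></body>'

print(matches_div_p_caption(inside))
print(matches_div_p_caption(outside))
```

Both samples contain `class="caption"`, so a plain text search would call the style used in both, yet only the first one has the paragraph inside a `<div>`.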

A possible hackish way to look for unused styles: Set everything to display:none, except for one selector. Open the files in a browser. If you see anything, that selector is being used there. Repeat with a different selector, etc.

Finding all undefined styles is harder/impossible. How could an automated tool know that, for this:

Code:
<div class="poetry">
<p>...</p>
<p>...</p>
</div>
the "div.poetry p" style is missing?
Old 06-02-2012, 09:04 AM   #12
DiapDealer
Grand Sorcerer
Quote:
Originally Posted by Jellby
Finding all undefined styles is harder/impossible. How could an automated tool know that, for this:

Code:
<div class="poetry">
<p>...</p>
<p>...</p>
</div>
the "div.poetry p" style is missing?
True, that would be next to impossible to programmatically determine... but I wasn't really thinking of that level of control. I was only thinking of truly empty/orphaned class names—meaning that given your html example, ".poetry" simply doesn't exist anywhere in the CSS.
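A minimal Python sketch of that simpler check, assuming well-formed XHTML and treating the CSS as plain text. The names `used_classes` and `orphaned_classes` are invented for this example, and the text search for `.name` is only an approximation: it could be fooled by comments or by file names inside the stylesheet.

```python
import re
import xml.etree.ElementTree as ET

def used_classes(xhtml_text):
    """Collect every class name assigned to an element."""
    names = set()
    for element in ET.fromstring(xhtml_text).iter():
        names.update((element.get("class") or "").split())
    return names

def orphaned_classes(css_text, xhtml_texts):
    """Classes used in the XHTML that never appear as .name in the CSS."""
    orphans = set()
    for text in xhtml_texts:
        for name in used_classes(text):
            # The lookahead stops ".poem" from matching ".poem-title".
            pattern = r"\." + re.escape(name) + r"(?![\w-])"
            if not re.search(pattern, css_text):
                orphans.add(name)
    return orphans

css = "div.stanza p { margin: 0 } .title { font-weight: bold }"
page = '<body><h1 class="title">T</h1><div class="poetry"><p>...</p></div></body>'

print(sorted(orphaned_classes(css, [page])))
```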
Old 06-02-2012, 11:22 AM   #13
roger64
Wizard
I am a little off topic here, but this may interest some people.

As a good part of my work is done upstream in OpenOffice, I asked, on a French Linux forum, if there was any tool for this kind of job. Well, yes.

This is a macro that detects unused custom styles in an ODT file and offers to delete them. I tried it and it works for this purpose (at least in the French-language version), first for character styles, then for paragraph styles.
You have to insert it in My Macros/Standard.

Code:
'---------------------------------------------------------- 03/02/2012
' Delete the unused custom styles
' of a text document or a spreadsheet
'---------------------------------------------------------------------
sub stylesPersoInutiles()
dim coStylesFamilles as object, oStyleFamille as object
dim oStyle as object, nomFamille as string
dim f as long, x as long
dim ts(), buf as string, iRet as integer
const SEP = ", "

    coStylesFamilles = thisComponent.StyleFamilies
    for f = 0 to coStylesFamilles.count -1
        ' For each style family
        nomFamille = coStylesFamilles.elementNames(f)
        oStyleFamille = coStylesFamilles.getByName(nomFamille)
        buf = ""
        for x = 0 to oStyleFamille.Count -1
            ' For each style
            oStyle = oStyleFamille(x)
'xray oStyle
            if (oStyle.isUserDefined) and (not oStyle.isInUse) then
                buf = buf & oStyle.name & SEP
            end if
        next x

        if len(buf) > len(SEP) then
            ' Drop the trailing separator
            buf = left(buf, len(buf) - len(SEP))
            ' 4+32+256 = Yes/No buttons, question icon, second button is the default
            iRet = msgBox("Unused custom styles: " _
                & chr(13) & buf & chr(13) & chr(13) _
                & "Should they be deleted?", 4+32+256, nomFamille)
            ' 6 = the user answered Yes
            if iRet = 6 then
                ts = split(buf, SEP)
                for x = 0 to uBound(ts)
                    oStyleFamille.removeByName(ts(x))
                next x
            end if
        end if
    next f
end sub


It really works.



End of off-topic.

Last edited by roger64; 06-07-2012 at 04:49 AM. Reason: spoiler
Old 06-02-2012, 03:04 PM   #14
PeterT
Taking a break; Fed up
You might like to look at "Degristling the sausage", which has some discussion of locating unused CSS classes.

Unfortunately, it uses a Mac editor, BBEdit.

However, in one of the comments, the responder points to a Python 3 script that will list all used CSS styles.

Code:
Usage: from the CLI, type ./cssstylelist.py > dump.txt
Code:
#! /usr/bin/env python3
# file: cssstylelist.py
# Make a list of every CSS class used in the (x)html files and print it.
# From the CLI, type `./cssstylelist.py > file_with_a_list.txt`
# Feel free to address any complaints to @gabalese

import glob
import os
import sys

try:
    from lxml import etree as ET
except ImportError:
    import xml.etree.ElementTree as ET
    # Report on stderr so the message does not end up in the redirected list.
    print("lxml not installed. Running with xml.etree instead", file=sys.stderr)

path = "OEBPS/Text"  # your mileage may vary


def cssList():
    """Return every class name used in the files, in order of first appearance."""
    classes = []
    for infile in glob.glob(os.path.join(path, "*html")):
        try:
            html = ET.parse(infile).getroot()
        except (ET.ParseError, OSError):
            print("ERROR: Unable to parse " + infile, file=sys.stderr)
            print("This is likely to happen with ill-formed xhtml files.",
                  file=sys.stderr)
            sys.exit(1)
        for element in html.iter():
            # A class attribute may hold several space-separated names.
            for name in (element.get("class") or "").split():
                if name not in classes:
                    classes.append(name)
    return classes


if __name__ == "__main__":
    for item in cssList():
        print(item)
Old 06-02-2012, 03:55 PM   #15
meme
Sigil developer
Thanks.

Getting the list of used styles should be fairly straightforward if we're just looking for classes that are used. The internal XML parsers can list them, as this script does. How to list them is another matter.

But parsing the CSS files to find the unused items is entirely different. There are a lot more ways to define items in the CSS than I've used.

An interesting challenge to look into for a later version.