|  06-02-2012, 03:08 PM | #16 | 
| frumious Bandersnatch            Posts: 7,570 Karma: 20150435 Join Date: Jan 2008 Location: Spaniard in Sweden Device: Cybook Orizon, Kobo Aura | 
			
			Calibre does some flattening to the CSS which, I believe, means it turns every selector into a class... Of course I'm not suggesting this should be done in Sigil, but maybe the "logic" behind that can be used to check the styles?
		 | 
|   |   | 
|  06-04-2012, 02:11 AM | #17 | 
| Zealot            Posts: 114 Karma: 5246 Join Date: Jul 2010 Device: none | 
			
			I use Linux primarily, and this is what I sometimes use. It's hackish and not so elegant, and it doesn't take into account CSS rules for HTML tags, e.g. h2 {font-size: xx-large;}, so it only covers CSS rules for .foo in e.g. <h2 class="foo">.
Code:
$ cd OEBPS/Text
$ pcregrep -o -h 'class=".+?"' * | sort -u | perl -p -e 's/class="//' | perl -p -e 's/("\n| )/|/' | perl -p -e 's/\|$//g'
Code:
ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4
Code:
$ cd ../Styles
$ STRING="(ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4)"
$ pcregrep -N ANY -M '\.'$STRING'\s*?\{(\s*.+?)+?\s*\}' *.css | 
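For readers without pcregrep, the first step above (collecting the class names used in the text files) can probably be approximated with plain grep, sed, tr and sort. This is only a sketch: the function names `list_used_classes` and `classes_as_alternation` are made up for illustration, and it assumes class attributes are written with double quotes, as in the posts above. Unlike the perl chain, it also splits multi-class values such as class="foo bar" into separate names.

```shell
#!/bin/sh
# Hypothetical helper: print each class name found in class="..." attributes
# under directory $1, one per line, sorted and de-duplicated.
list_used_classes() {
    grep -rhoE 'class="[^"]*"' "$1" |
        sed -e 's/^class="//' -e 's/"$//' |   # strip the attribute wrapper
        tr -s '[:space:]' '\n' |              # one class name per line
        sed '/^$/d' |                         # drop empty lines
        sort -u
}

# Hypothetical helper: join the names with "|" to get the alternation
# that the post above stores in the STRING variable.
classes_as_alternation() {
    list_used_classes "$1" | paste -s -d '|' -
}
```

Usage would be something like `classes_as_alternation OEBPS/Text`, run from the unpacked book directory.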
|   |   | 
|  06-07-2012, 05:47 AM | #18 | 
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) | 
			
			@Ahmad Samir Thanks for providing such nice code. Your first command, which lists the styles used in the text, is a good part of the answer. Would it be possible to add, for each listed style, its number of occurrences? I mean, to get a list of this kind: |Italdroite 25|Header 12| and so on... We could then focus on the least used items. Spoiler: 
 I did not find the second one so useful: it somewhat replicates the style sheet and is a little too verbose, or, more probably, I missed something... Last edited by roger64; 06-07-2012 at 05:50 AM. | 
|   |   | 
|  06-07-2012, 08:48 AM | #19 | 
| Zealot            Posts: 114 Karma: 5246 Join Date: Jul 2010 Device: none | 
			
			The second command is supposed to print only the css rules that appeared in the output of the first command... that's how it usually works here. To get a count of the occurrences of each class, you can use something like:
Code:
for i in $(pcregrep -o -h 'class=".+?"' * | sort -u); do echo "$i $(grep -r $i | wc -l)"; done | sort -t ' ' -k2 -nr
Code:
class="di" 54
class="body" 52
class="toccn" 50
class="dropcap" 44
class="cotx" 44
class="cn" 44
class="acl" 14
class="p3" 6
class="crt1" 5
class="ull" 4
class="crt" 4
class="pt" 3 | 
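A single-pass alternative to the loop above is the usual sort | uniq -c idiom, which avoids running grep once per class. The sketch below is an illustration, not the poster's method; `count_class_uses` is a hypothetical name, and it counts individual class names (splitting values like class="foo bar") rather than matching lines, so its numbers can differ slightly from the loop's.

```shell
#!/bin/sh
# Hypothetical helper: print "name count" for every class name used in
# class="..." attributes under directory $1, most frequent first.
count_class_uses() {
    grep -rhoE 'class="[^"]*"' "$1" |
        sed -e 's/^class="//' -e 's/"$//' |   # keep only the attribute value
        tr -s '[:space:]' '\n' |              # one class name per line
        sed '/^$/d' |
        sort |
        uniq -c |                             # prefix each name with its count
        awk '{ print $2, $1 }' |              # reorder to "name count"
        sort -k2,2nr -k1,1                    # by count descending, then name
}
```

Called as `count_class_uses OEBPS/Text`, the least used classes end up at the bottom of the list.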
|   |   | 
|  06-07-2012, 10:13 AM | #20 | 
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) |  This is exactly what I had been wishing for! The only problem... I cannot get it working this time. After half a minute, the terminal freezes and I get no result. | 
|   |   | 
|  06-07-2012, 10:55 AM | #21 | 
| Zealot            Posts: 114 Karma: 5246 Join Date: Jul 2010 Device: none | |
|   |   | 
|  06-07-2012, 11:11 AM | #22 | |
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) | Quote: 
 I can return to the prompt with Ctrl+C at any time. | |
|   |   | 
|  06-07-2012, 11:22 AM | #23 | 
| Zealot            Posts: 114 Karma: 5246 Join Date: Jul 2010 Device: none | |
|   |   | 
|  06-07-2012, 11:24 AM | #24 | |
| Resident Curmudgeon            Posts: 80,665 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | Quote: 
 | |
|   |   | 
|  06-07-2012, 11:44 AM | #25 | 
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) | 
			
			Immediate result:
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $ pcregrep -o -h 'class=".+?"' * | sort -u
class="Centrage"
class="Chanson"
class="frameFrame"
class="Header"
class="Heading"
class="Italdroite"
class="let"
class="let1"
class="let2"
class="smcpCentrage"
class="smcpChanson"
class="smcpDroite"
class="smcpIncise"
class="smcpTypeA"
class="smcpTypeV"
class="Standard"
class="Subtitle"
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $ | 
|   |   | 
|  06-08-2012, 01:51 AM | #26 | 
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) | 
			
			Here is a solution for the above post, coming straight from a Linux forum. I was told that the problem was that grep was reading from stdin (whatever that means...) Spoiler: 
 The result provides the needed information (classes only). Last edited by roger64; 06-08-2012 at 01:59 AM. | 
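The contents of the spoiler are not preserved here, so the following is only a guess at the kind of fix that the stdin diagnosis points to. Older GNU grep releases read standard input when -r was given without a file or directory operand, which would make the loop appear to freeze. Passing the directory explicitly avoids that. The name `count_attr_lines` is hypothetical, and the count is printed first purely to keep the final sort simple.

```shell
#!/bin/sh
# Hypothetical helper: for each distinct class="..." attribute under $1,
# print the number of lines containing it, followed by the attribute.
count_attr_lines() {
    grep -rhoE 'class="[^"]*"' "$1" | sort -u |
        while IFS= read -r attr; do
            # "$1" is an explicit operand, so grep never waits on stdin;
            # </dev/null additionally keeps it away from the loop's input.
            n=$(grep -rhF -e "$attr" "$1" </dev/null | wc -l)
            printf '%s %s\n' "$((n))" "$attr"
        done |
        sort -nr
}
```

From inside OEBPS/Text this would be called as `count_attr_lines .`.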
|   |   | 
|  06-08-2012, 03:35 AM | #27 | 
| Zealot            Posts: 114 Karma: 5246 Join Date: Jul 2010 Device: none | 
			
			Glad it's working for you now. 'grep -r' seems to work here, but of course 'grep -rc $i' is a better/more elegant solution than 'grep -r $i | wc -l'.
		 | 
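One caveat on swapping in grep -rc: -c prints a count per file (one file:count line for each file searched), whereas grep -r ... | wc -l gives a single total. To get the same total from -c, the per-file counts have to be summed. A minimal sketch, with the hypothetical name `total_matching_lines`:

```shell
#!/bin/sh
# Hypothetical helper: total number of lines containing the fixed string $1
# in all files under directory $2.
total_matching_lines() {
    # grep -rc prints "path:count" per file; sum the last ":"-separated field.
    grep -rcF -e "$1" "$2" | awk -F: '{ sum += $NF } END { print sum + 0 }'
}
```

For example, `total_matching_lines 'class="di"' .` should report the same number as piping the plain recursive grep through wc -l.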
|   |   | 
|  06-08-2012, 05:40 AM | #28 | 
| Berti            Posts: 1,197 Karma: 4985964 Join Date: Jan 2012 Location: Zischebattem Device: Acer Lumiread | 
			
			I made a small Excel tool which reads the files in Sigil's "scratchpad" and covers most of the requested features. Do you work with Linux only? Maybe somebody finds it useful... ------------------------------- edit 12.10.2012: Attachment removed. Sigil now has some great new features, making this one obsolete. Last edited by mmat1; 10-12-2012 at 01:06 PM. Reason: Update to version 3.1 | 
|   |   | 
|  06-08-2012, 07:35 AM | #29 | |
| Wizard            Posts: 2,625 Karma: 3120635 Join Date: Jan 2009 Device: Kindle PW3 (wifi) | Quote: 
 On Linux, Sigil's scratchpad temp files are here: /tmp/Sigil/scratchpad. Nothing too confusing about it. I did authorize your macros in OpenOffice, but nothing happened when I clicked on Analyze. Maybe it's coming from OpenOffice, maybe from permissions, probably from me... Anyway, I'm used to things never working on the first try.  Which string did you use to search for the tags? Last edited by roger64; 06-08-2012 at 07:37 AM. | |
|   |   | 
|  06-08-2012, 10:14 AM | #30 | |
| Berti            Posts: 1,197 Karma: 4985964 Join Date: Jan 2012 Location: Zischebattem Device: Acer Lumiread | Quote: 
 I wonder if this thing will work under LO/OO in Linux ... I'm not sure I understand the question. Maybe this is the right answer: it searches for everything which follows a < and is not preceded by a /. Must-have tags ("html", "body") are omitted in most cases. And it searches for the contents of class=".*?" | |
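On Linux, the tag search described above can be roughly imitated from the shell. This is only an approximation of the description, not the macro's actual logic: `count_tags` is a hypothetical name, it matches opening tags only (a name directly after <, which excludes closing tags, comments and declarations), and it does not omit "html" or "body".

```shell
#!/bin/sh
# Hypothetical helper: print "tag count" for every opening tag name found
# in files under directory $1, most frequent first.
count_tags() {
    grep -rhoE '<[A-Za-z][A-Za-z0-9:]*' "$1" |
        sed 's/^<//' |                # keep only the tag name
        sort |
        uniq -c |                     # prefix each name with its count
        awk '{ print $2, $1 }' |      # reorder to "tag count"
        sort -k2,2nr -k1,1            # by count descending, then name
}
```

With Sigil running, something like `count_tags /tmp/Sigil/scratchpad` would be the equivalent call, assuming the scratchpad location given earlier in the thread.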
|   |   | 