06-02-2012, 03:08 PM | #16 |
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Calibre does some flattening to the CSS which, I believe, means it turns every selector into a class... Of course I'm not suggesting this should be done in Sigil, but maybe the "logic" behind that can be used to check the styles?
|
06-04-2012, 02:11 AM | #17 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
I use Linux primarily; this is what I use sometimes; it's hackish, not so elegant and it doesn't take into account CSS rules for HTML tags, e.g.:
h2 {font-size: xx-large;}
so only CSS rules for .foo in e.g. <h2 class="foo">.
Code:
$ cd OEBPS/Text
$ pcregrep -o -h 'class=".+?"' * | sort -u | perl -p -e 's/class="//' | perl -p -e 's/("\n| )/|/' | perl -p -e 's/\|$//g'
Code:
ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4
Code:
$ cd ../Styles
$ STRING="(ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4)"
$ pcregrep -N ANY -M '\.'$STRING'\s*?\{(\s*.+?)+?\s*\}' *.css |
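For anyone without pcregrep or perl installed, here is a rough sketch of the first command using only grep, sed, tr and sort. The function name `used_classes` is made up for this example, and it assumes class attributes are written with double quotes on a single line. Unlike the pipeline above, it also splits multi-valued attributes such as class="foo bar" into separate names.

```shell
# used_classes FILE...
# Hypothetical helper (not part of Sigil or pcregrep): prints every class
# name used in the given files, one per line, sorted and de-duplicated.
used_classes() {
    # -o prints only the matched text, -h suppresses the file names
    grep -o -h 'class="[^"]*"' "$@" |
        sed -e 's/^class="//' -e 's/"$//' |  # strip the class="..." wrapper
        tr -s ' ' '\n' |                     # one name per line
        sed '/^$/d' |                        # drop empty lines
        sort -u
}

# Usage, from the EPUB's OEBPS/Text directory, giving a |-separated list:
#   used_classes * | paste -s -d '|' -
```

The paste step in the usage comment only joins the lines with "|", so the result can be dropped into the STRING variable of the second command.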
06-07-2012, 05:47 AM | #18 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
@Ahmad Samir
Thanks for providing some very nice code. Your first command, which lists the styles used in the text, is a good part of the answer. Would it be possible to add, for each of the listed styles, the number of occurrences? I mean, to get a list of this kind: |Italdroite 25|Header 12| and so on... We could focus later on the least used items. Spoiler:
I did not find the second one so useful: it somewhat replicates the style sheet and is a little too verbose, or, more probably, I missed something... Last edited by roger64; 06-07-2012 at 05:50 AM. |
06-07-2012, 08:48 AM | #19 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
The second command is supposed to print only the css rules that appeared in the output of the first command... that's how it usually works here.
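If the aim is the reverse check, i.e. class selectors that exist in the stylesheets but are never referenced in the text, something along these lines might do. This is only a sketch: `unused_classes` is a made-up name, the directory layout (Text and Styles) is taken from the commands earlier in the thread, and the CSS matching is crude (anything that looks like ".name", so a file extension inside url(...) would show up as a false positive).

```shell
# unused_classes TEXT_DIR CSS_DIR
# Hypothetical helper: prints class names that appear as ".name" in the
# stylesheets of CSS_DIR but in no class="..." attribute under TEXT_DIR.
unused_classes() {
    text_dir=$1
    css_dir=$2
    used=$(mktemp)
    defined=$(mktemp)

    # Class names referenced in the text files, one per line.
    grep -o -h 'class="[^"]*"' "$text_dir"/* |
        sed -e 's/^class="//' -e 's/"$//' |
        tr -s ' ' '\n' | sed '/^$/d' |
        LC_ALL=C sort -u > "$used"

    # Anything that looks like a class selector in the stylesheets.
    grep -o -h '\.[A-Za-z_][A-Za-z0-9_-]*' "$css_dir"/*.css |
        sed 's/^\.//' |
        LC_ALL=C sort -u > "$defined"

    # comm -13 keeps the lines that occur only in the second file.
    LC_ALL=C comm -13 "$used" "$defined"
    rm -f "$used" "$defined"
}

# Usage, from the OEBPS directory:
#   unused_classes Text Styles
```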
To get a count of the occurrences of each class, you can use something like:
Code:
for i in $(pcregrep -o -h 'class=".+?"' * | sort -u); do echo "$i $(grep -r $i | wc -l)"; done | sort -t ' ' -k2 -nr
Code:
class="di" 54
class="body" 52
class="toccn" 50
class="dropcap" 44
class="cotx" 44
class="cn" 44
class="acl" 14
class="p3" 6
class="crt1" 5
class="ull" 4
class="crt" 4
class="pt" 3 |
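A single-pass alternative, sketched with plain grep, sort and uniq, avoids running one grep per class. The name `count_classes` is made up for this example. Note two differences from the loop above: it counts every occurrence rather than every matching line, and the count is printed before the attribute.

```shell
# count_classes FILE...
# Hypothetical helper: counts how often each class="..." attribute occurs
# in the given files and prints the most frequent first.
count_classes() {
    grep -o -h 'class="[^"]*"' "$@" |  # one attribute per output line
        sort |                         # group identical attributes together
        uniq -c |                      # prefix each group with its count
        sort -k1,1nr                   # highest count first
}

# Usage, from the EPUB's OEBPS/Text directory:
#   count_classes *
```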
06-07-2012, 10:13 AM | #20 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
This is exactly what I had been wishing for! The only problem... I cannot get it working this time. After half a minute, the terminal freezes and I get no result. |
06-07-2012, 10:55 AM | #21 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
|
06-07-2012, 11:11 AM | #22 | |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Quote:
I can return to the prompt with Ctrl+C at any time. |
|
06-07-2012, 11:22 AM | #23 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
|
06-07-2012, 11:24 AM | #24 | |
Resident Curmudgeon
Posts: 73,894
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
06-07-2012, 11:44 AM | #25 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Immediate result:
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $ pcregrep -o -h 'class=".+?"' * | sort -u
class="Centrage"
class="Chanson"
class="frameFrame"
class="Header"
class="Heading"
class="Italdroite"
class="let"
class="let1"
class="let2"
class="smcpCentrage"
class="smcpChanson"
class="smcpDroite"
class="smcpIncise"
class="smcpTypeA"
class="smcpTypeV"
class="Standard"
class="Subtitle"
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $ |
06-08-2012, 01:51 AM | #26 |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Here is a solution for the above post, coming straight from a Linux forum.
I have been told that the problem was that grep was reading from stdin (whatever that means...) Spoiler:
The result provides the needed information (classes only). Last edited by roger64; 06-08-2012 at 01:59 AM. |
06-08-2012, 03:35 AM | #27 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
Glad it's working for you now. 'grep -r' seems to work here, but of course 'grep -rc $i' is a better/more elegant solution than 'grep -r $i | wc -l'.
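Two things worth knowing here. Depending on the grep version, 'grep -r' without a file or directory argument may read from stdin instead of searching the current directory, which looks like a frozen terminal. Also, when given several files, 'grep -c' prints one count per file rather than a single total. A small sketch that sidesteps both by always passing explicit files (the name `count_lines` is made up for this example):

```shell
# count_lines PATTERN FILE...
# Hypothetical helper: prints the total number of lines, across all FILEs,
# that contain the fixed string PATTERN.
count_lines() {
    if [ "$#" -lt 2 ]; then
        echo "usage: count_lines PATTERN FILE..." >&2
        return 2
    fi
    pattern=$1
    shift
    # cat merges the files into one stream, so grep -c reports a single
    # total instead of one count per file; -F treats PATTERN literally.
    cat -- "$@" | grep -c -F -e "$pattern"
}

# Usage, mirroring the earlier loop (assumes single-valued class attributes):
#   for i in $(grep -o -h 'class="[^"]*"' * | sort -u); do
#       echo "$i $(count_lines "$i" *)"
#   done | sort -t ' ' -k2 -nr
```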
|
06-08-2012, 05:40 AM | #28 |
Berti
Posts: 1,196
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
|
I made a small Excel tool which reads the files in Sigil's "scratchpad" and covers most of the requested features.
Do you work with Linux only? Maybe somebody finds it useful...
-------------------------------
edit 12.10.2012: Attachment removed. Sigil now has some great new features, making this one obsolete.
Last edited by mmat1; 10-12-2012 at 01:06 PM. Reason: Update to version 3.1 |
06-08-2012, 07:35 AM | #29 | |
Wizard
Posts: 2,608
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Quote:
On Linux, Sigil's scratchpad temp files are here: /tmp/Sigil/scratchpad. Nothing too confusing about it. I did authorize your macros in OpenOffice, but nothing happened when I clicked on Analyze. Maybe it's coming from OpenOffice, maybe from permissions, probably from me... Anyway, I'm used to things never working on the first try. Which string did you use to search for the tags? Last edited by roger64; 06-08-2012 at 07:37 AM. |
|
06-08-2012, 10:14 AM | #30 | |
Berti
Posts: 1,196
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
|
Quote:
I wonder if this thing will work under LO/OO in Linux ... This is not understood. Maybe this is a correct answer: It searches for everything which follows a < and is not preceded by a /. Must-have tags ("html", "body") are omitted in most cases. And it searches for the contents of class=".*?" |
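For Linux users, the tag search described above can be approximated in the shell. This is not the Excel tool's code, just a sketch of the same idea; `count_tags` is a made-up name, and unlike the tool it does not leave out "html" or "body".

```shell
# count_tags FILE...
# Hypothetical helper: counts opening tags by name, most frequent first.
count_tags() {
    # "<" followed directly by a letter starts an opening tag; closing tags
    # such as </p> are skipped because "/" is not a letter.
    grep -o -h '<[A-Za-z][A-Za-z0-9]*' "$@" |
        tr -d '<' |
        sort |
        uniq -c |
        sort -k1,1nr
}

# Usage, from the EPUB's OEBPS/Text directory:
#   count_tags *
```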
|
|