#16 |
frumious Bandersnatch
Posts: 7,549
Karma: 19500001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Calibre does some flattening to the CSS which, I believe, means it turns every selector into a class... Of course I'm not suggesting this should be done in Sigil, but maybe the "logic" behind that can be used to check the styles?
|
#17 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
I use Linux primarily; this is what I sometimes use. It's hackish, not so elegant, and it doesn't take into account CSS rules for HTML tags, e.g. h2 {font-size: xx-large;}, so it only covers CSS rules for .foo in e.g. <h2 class="foo">.
Code:
$ cd OEBPS/Text
$ pcregrep -o -h 'class=".+?"' * | sort -u | perl -p -e 's/class="//' | perl -p -e 's/("\n| )/|/' | perl -p -e 's/\|$//g'
Code:
ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4
Code:
$ cd ../Styles
$ STRING="(ach1|acl|bmh|byline|cn|cotx|crt|crt1|crt2|custom1|da|ded|ded1|di|dia|dropcap|dt|dt1|dt3|dt4)"
$ pcregrep -N ANY -M '\.'$STRING'\s*?\{(\s*.+?)+?\s*\}' *.css
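The two commands above stop one step short of what this thread is after, namely the styles that nobody uses. Here is a minimal sketch of that last step, assuming a POSIX shell and a grep that supports -o (GNU and BSD grep do). The function names are hypothetical and not taken from any post here. It compares the class names used in the text with the class selectors found in the stylesheets and prints the ones that are defined but never used. It is crude: like the commands above it ignores rules for plain HTML tags, and it would also pick up class-like strings inside CSS comments.

```shell
#!/bin/sh
# Hypothetical helper names; a sketch, not a robust CSS parser.

# Print every class name used in class="..." attributes, one per line.
# Attributes holding several classes (class="a b") are split on spaces.
used_classes() {
    grep -oh 'class="[^"]*"' "$@" |
        sed -e 's/^class="//' -e 's/"$//' |
        tr -s ' ' '\n' |
        sed '/^$/d' |
        LC_ALL=C sort -u
}

# Print every class name that appears as a .name selector in the CSS files.
# Declaration blocks {...} are blanked first so values such as url(a.png)
# are not mistaken for selectors.
defined_classes() {
    cat "$@" |
        tr '\n' ' ' |
        sed 's/{[^{}]*}/ /g' |
        grep -o '\.[A-Za-z_][A-Za-z0-9_-]*' |
        sed 's/^\.//' |
        LC_ALL=C sort -u
}

# $1 = directory with the text files, $2 = directory with the stylesheets.
# Prints the classes that are defined in the CSS but never used in the text.
unused_classes() {
    used=$(mktemp) || return 1
    defined=$(mktemp) || return 1
    used_classes "$1"/* > "$used"
    defined_classes "$2"/*.css > "$defined"
    # comm -13 keeps only the lines unique to the second (defined) list
    LC_ALL=C comm -13 "$used" "$defined"
    rm -f "$used" "$defined"
}

# Example call, run from the unpacked book's root:
#   unused_classes OEBPS/Text OEBPS/Styles
```

Swapping `comm -13` for `comm -23` would instead list classes that are used in the text but have no rule in the stylesheets.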
#18 |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
@Ahmad Samir
Thanks for providing some very nice code. Your first command, which lists the styles used in the text, is a good part of the answer. Would it be possible to add, for each listed style, its number of occurrences? I mean a list of this kind: |Italdroite 25|Header 12| and so on... We could then focus on the least used items.
Spoiler:
I did not find the second command so useful: it more or less replicates the style sheet and is a little too verbose, or, more probably, I missed something...
Last edited by roger64; 06-07-2012 at 05:50 AM.
#19 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
The second command is supposed to print only the CSS rules for the classes that appeared in the output of the first command... that's how it usually works here.
To get a count of the occurrences of each class, you can use something like:
Code:
for i in $(pcregrep -o -h 'class=".+?"' * | sort -u); do echo "$i $(grep -r $i | wc -l)"; done | sort -t ' ' -k2 -nr
Code:
class="di" 54
class="body" 52
class="toccn" 50
class="dropcap" 44
class="cotx" 44
class="cn" 44
class="acl" 14
class="p3" 6
class="crt1" 5
class="ull" 4
class="crt" 4
class="pt" 3
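The same kind of list can also be produced in a single pass, without running one grep per class. This is only a sketch under a few assumptions: a grep with -o, plain grep instead of pcregrep, and a hypothetical function name. Note one difference from the loop above: it counts every occurrence of an attribute, while `grep ... | wc -l` counts matching lines, so the numbers can differ when a line holds the same class twice.

```shell
#!/bin/sh
# Hypothetical helper: count how often each class="..." attribute occurs
# in the given files, most frequent first, printed as: class="name" COUNT
count_classes() {
    grep -oh 'class="[^"]*"' "$@" |
        LC_ALL=C sort |
        uniq -c |
        LC_ALL=C sort -k1,1nr -k2 |
        # uniq -c prints "  COUNT text"; move the count to the end of the line
        sed 's/^ *\([0-9][0-9]*\) \(.*\)$/\2 \1/'
}

# Example call, run from OEBPS/Text:
#   count_classes *
```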
#20 |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
This is exactly what I had been wishing for! The only problem: I cannot get it working this time. After half a minute, the terminal freezes and I get no result.
#21 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
|
#22 | |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Quote:
I can return to the prompt with Ctrl+C at any time. |
|
#23 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
|
#24 | |
Resident Curmudgeon
Posts: 79,760
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
#25 |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Immediate result:
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $ pcregrep -o -h 'class=".+?"' * | sort -u
class="Centrage"
class="Chanson"
class="frameFrame"
class="Header"
class="Heading"
class="Italdroite"
class="let"
class="let1"
class="let2"
class="smcpCentrage"
class="smcpChanson"
class="smcpDroite"
class="smcpIncise"
class="smcpTypeA"
class="smcpTypeV"
class="Standard"
class="Subtitle"
roger@lmde64 ~/Bureau/Coups/OEBPS/Text $
#26 |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Here is a solution for the above post, coming straight from a Linux forum.
I have been told that the problem was that grep was reading from stdin (whatever that means...)
Spoiler:
The result provides the needed information (classes only).
Last edited by roger64; 06-08-2012 at 01:59 AM.
#27 |
Zealot
Posts: 114
Karma: 5246
Join Date: Jul 2010
Device: none
|
Glad it's working for you now. 'grep -r' seems to work here, but of course 'grep -rc $i' is a better/more elegant solution than 'grep -r $i | wc -l'.
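For anyone hitting the same freeze: as far as I know, older GNU grep releases ignore -r when no file or directory is named and wait for input on stdin instead, which looks exactly like a hung terminal. A hedged sketch of the counting loop with an explicit search path follows. The function names are hypothetical, and it uses plain grep rather than pcregrep. It also shows that `grep -rc` prints one count per file rather than a single total, so the per-file counts still need to be added up.

```shell
#!/bin/sh
# Hypothetical helpers; $1 (or ".") is always passed to grep as an explicit
# path, so grep never falls back to reading standard input.

# Print each class="..." attribute and the number of lines containing it,
# tab-separated, highest count first.
count_lines_per_class() {
    dir=${1:-.}
    tab=$(printf '\t')
    grep -roh 'class="[^"]*"' "$dir" | LC_ALL=C sort -u |
    while IFS= read -r attr; do
        # -F matches the attribute literally; -- guards odd leading characters
        n=$(grep -rF -- "$attr" "$dir" | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$attr" "$n"
    done | LC_ALL=C sort -t "$tab" -k2,2nr
}

# Same count for one string using grep -rc: it prints "file:count" per file,
# so the last colon-separated field of every line is summed.
total_lines_rc() {
    grep -rcF -- "$1" "$2" | awk -F: '{ s += $NF } END { print s + 0 }'
}

# Example calls, run from OEBPS/Text:
#   count_lines_per_class .
#   total_lines_rc 'class="di"' .
```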
|
#28 |
Berti
Posts: 1,197
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
|
I made a small Excel tool which reads the files in Sigil's "scratchpad" and covers most of the requested features.
Do you work with Linux only? Maybe somebody finds it useful...
-------------------------------
edit 12.10.2012: Attachment removed. Sigil now has some great new features that make this one obsolete.
Last edited by mmat1; 10-12-2012 at 01:06 PM. Reason: Update to version 3.1
#29 | |
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Quote:
On Linux, Sigil's scratchpad temp files are here: /tmp/Sigil/scratchpad. Nothing too confusing about it.
I did authorize your macros in OpenOffice, but nothing happened when I clicked on Analyze. Maybe it's coming from OpenOffice, maybe from permissions, probably from me... Anyway, I'm used to seeing things not work on the first try.
Which string did you use to search for the tags?
Last edited by roger64; 06-08-2012 at 07:37 AM.
|
#30 | |
Berti
Posts: 1,197
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
|
Quote:
I wonder if this thing will work under LO/OO in Linux...
This is not understood. Maybe this is a correct answer: it searches for everything which follows a < and is not preceded by a /. Must-have tags ("html", "body") are omitted in most cases. And it searches for the contents of class=".*?"
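For Linux users without the spreadsheet, the tag search described here can be approximated in the shell. This is a rough sketch and not the tool's actual code: the function name and the regular expression are my own assumptions. It collects every name that directly follows a < (which skips closing tags, since those start with </) and counts how often each one occurs.

```shell
#!/bin/sh
# Hypothetical helper: count opening tags in the given files,
# most frequent first, printed as: TAG COUNT
count_tags() {
    # '<' followed by a letter matches opening tags only; "</p>", "<?xml"
    # and "<!DOCTYPE" are skipped because '/', '?' and '!' are not letters
    grep -ohE '<[A-Za-z][A-Za-z0-9]*' "$@" |
        sed 's/^<//' |
        LC_ALL=C sort |
        uniq -c |
        LC_ALL=C sort -k1,1nr -k2 |
        awk '{ print $2, $1 }'
}

# Example call, run from OEBPS/Text:
#   count_tags *
```

Unlike the description above, this sketch does not leave out html and body; piping the result through `grep -vE '^(html|body) '` would drop them.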
|