Old 02-23-2012, 02:37 PM   #1
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
regex puzzle: finding paragraph before...

Due to a badly formatted book, I was trying to construct a regex that would find any <p ...>...</p> section occurring immediately before a <div, in order to then tweak that found chunk.

But I could not do it.
A find expression like <p class "whatever">(.*)</p>?\s*<div is too greedy: it grabbed a whole load of paragraphs.

i.e. from
<p para 1...
<p para 2..
...
<p para n..
< div....

The above regex grabs all n paragraphs. Is there a way to grab only the nth one and replace its CSS class?

PS: I am still using Sigil 0.4.2 and its regex.

Or could I use a .p+div class in CSS?
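
A minimal sketch of the greediness problem described above, using Python's `re` module as a stand-in (Sigil's own regex engine may behave differently). The sample markup is hypothetical, the pattern is a tidied version of the one in the post (with `=` added after `class` and the stray `?` dropped), and `re.DOTALL` is an assumption that mimics a dot that also matches newlines:

```python
import re

# Hypothetical sample: three paragraphs followed by a div.
html = (
    '<p class="whatever">para 1</p>\n'
    '<p class="whatever">para 2</p>\n'
    '<p class="whatever">para 3</p>\n'
    '<div class="box">'
)

# Greedy (.*) with a dot that also matches newlines (assumed behaviour).
greedy = re.compile(r'<p class="whatever">(.*)</p>\s*<div', re.DOTALL)

match = greedy.search(html)
grabbed = match.group(0)

# Count how many paragraph openings ended up inside the single match.
paragraph_count = grabbed.count('<p ')
print(paragraph_count)
```

The match starts at the first paragraph and runs all the way to the div, so every paragraph is swallowed rather than just the last one.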

Last edited by cybmole; 02-23-2012 at 02:39 PM.
Old 02-23-2012, 02:56 PM   #2
WS64
Posts: 660
Karma: 506380
Join Date: Aug 2010
Location: Germany
Device: Kobo Aura / PB Lux 2 / Bookeen Frontlight / Kobo Mini / Nook Color
<p class="whatever">([^<]*?)</p>\s*<div
Old 02-23-2012, 03:03 PM   #3
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
Quote:
Originally Posted by WS64 View Post
<p class="whatever">([^<]*?)</p>\s*<div
Thanks. If I read that correctly, it's blocking any extra instances of <. Will it cope with embedded style tags like <em or <i inside the main p-tagged paragraphs?
E.g. some of the paragraphs have extra embedded styles like:
<p class="calibre2">Without missing a beat, <em class="calibre4">High Wire</em> replies; “Without a job, I think I would head for the stars, to see what’s out there.”</p>
Old 02-23-2012, 03:36 PM   #4
mmat1
Berti
Posts: 1,196
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
Quote:
Originally Posted by cybmole View Post
if I read that correctly it's blocking any extra instances of < - will it cope with embedded style things like <em or < i inside of the main p tagged paragraphs ?
e.g. some of the paragraphs have extra embedded styles like:
You're right, any <span>, <i> etc. will break that pattern. ...
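
A small sketch of why inline tags are a problem for the `[^<]*?` pattern, using Python's `re` as a stand-in for Sigil's engine (an assumption; behaviour in Sigil may differ). The sample strings are hypothetical and use the `calibre2` class from the example paragraph:

```python
import re

# The [^<]*? body refuses to cross any "<", including inline tags.
pattern = re.compile(r'<p class="calibre2">([^<]*?)</p>\s*<div')

# Hypothetical samples: one with plain text only, one with an <em> inside.
plain = (
    '<p class="calibre2">first</p>\n'
    '<p class="calibre2">last</p>\n'
    '<div>'
)
styled = (
    '<p class="calibre2">first</p>\n'
    '<p class="calibre2">A <em class="calibre4">High Wire</em> reply</p>\n'
    '<div>'
)

plain_match = pattern.search(plain)
styled_match = pattern.search(styled)

print(plain_match.group(1))   # body of the paragraph just before the div
print(styled_match)           # no match once an inline tag is present
```

With plain text the pattern picks out only the last paragraph, but the paragraph containing `<em>` is skipped entirely.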

Actually
Code:
(<p.*?</p>)(\s*?<div>)
should do it, but test it carefully.

I'm not sure whether regex dotall will work in 0.4.2; try adding (?s) to the search statement.

>>or could I use a .p+div class in CSS ?
If you really want to change any <div> which follows a </p>, why not?
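
One reason to "test it carefully": a rough sketch in Python's `re` (a stand-in, not Sigil's engine) comparing the pattern above with and without `(?s)`. The sample markup is hypothetical and assumes one paragraph per line. A lazy `.*?` still begins at the first `<p` it can, so once the dot is allowed to cross newlines the match can stretch over every paragraph:

```python
import re

# Hypothetical sample: one paragraph per line, then a bare <div>.
html = (
    '<p class="a">para 1</p>\n'
    '<p class="a">para <em>2</em></p>\n'
    '<p class="a">para 3</p>\n'
    '<div>'
)

pattern = r'(<p.*?</p>)(\s*?<div>)'

# Without dotall the dot stops at each newline.
line_mode = re.search(pattern, html)
# With (?s) the dot matches newlines as well.
dotall_mode = re.search('(?s)' + pattern, html)

line_paragraph = line_mode.group(1)
dotall_paragraph = dotall_mode.group(1)

print(line_paragraph.count('<p '))
print(dotall_paragraph.count('<p '))
```

In this sketch the plain version isolates the last paragraph, while the `(?s)` version captures all three, so dotall may make things worse rather than better here.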

Last edited by mmat1; 02-23-2012 at 04:22 PM.
Old 02-23-2012, 03:47 PM   #5
Timur
Connoisseur
Posts: 54
Karma: 37363
Join Date: Aug 2011
Location: Istanbul
Device: EBW1150, Nook STR
If your paragraphs are contained in single lines with newlines between them you can use your pattern with a slight modification:

Code:
<p class "whatever">([^\r\n]*)</p>\s*<div
Or you can upgrade to 0.5.1, in which .(dot) does not match newlines unless you choose "Regex Dotall" mode, and you can use your original pattern unmodified.
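
A quick sketch of the `[^\r\n]*` idea in Python's `re` (used only as a stand-in for Sigil). The sample is hypothetical, `=` has been added after `class` so that it matches real markup, and `re.DOTALL` is switched on to show that the character class keeps the match on one line even when the dot would not:

```python
import re

# Hypothetical sample: each paragraph sits on its own line.
html = (
    '<p class="whatever">para 1</p>\n'
    '<p class="whatever">para <em>2</em></p>\n'
    '<div>'
)

# [^\r\n]* can cross inline tags but never a line break.
pattern = re.compile(r'<p class="whatever">([^\r\n]*)</p>\s*<div', re.DOTALL)

m = pattern.search(html)
inner = m.group(1)
print(inner)
```

Because the body cannot cross a line break, only the paragraph directly before the div matches, and inline tags such as `<em>` are allowed.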
Old 02-23-2012, 03:58 PM   #6
DiapDealer
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
It's pretty hard to fine-tune an expression's (non)greediness in 0.4.2 when the "Minimal Matching" check-box is the only method of control you have over it.

In 0.5.x and higher, I'd use something like:
Code:
<p(.*?)?>.*?</p>(?=(\s+)?<div)
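
A hedged sketch of how the lookahead version could be used to swap the class, written with Python's `re` rather than Sigil itself. The sample markup, the extra capture group around the paragraph body, the `retag` helper and the class name `last-before-div` are all assumptions made for illustration:

```python
import re

# Hypothetical sample: one paragraph per line, then a div.
html = (
    '<p class="calibre2">para 1</p>\n'
    '<p class="calibre2">para <em class="calibre4">2</em></p>\n'
    '<div class="box">text</div>'
)

# Pattern from the post, with a capture group added around the body.
# The lookahead checks for the div without consuming it.
pattern = re.compile(r'<p(.*?)?>(.*?)</p>(?=(\s+)?<div)')

def retag(match):
    # "last-before-div" is a made-up class name for this example.
    return '<p class="last-before-div">' + match.group(2) + '</p>'

result = pattern.sub(retag, html)
print(result)
```

Since the lookahead leaves the whitespace and the `<div` out of the match, only the paragraph itself is rewritten and the following markup is left untouched.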

Last edited by DiapDealer; 02-23-2012 at 04:03 PM.
Old 02-24-2012, 02:12 AM   #7
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
Thanks all, especially for explaining how 0.5.2 is better than 0.4.2. Eventually I am going to have enough reason to upgrade.

I see that I'm going to have to add a couple of symbols to my limited regex repertoire!

so far I have muddled through without ? or ^
Old 02-24-2012, 03:43 AM   #8
Timur
Connoisseur
Posts: 54
Karma: 37363
Join Date: Aug 2011
Location: Istanbul
Device: EBW1150, Nook STR
Sigil 0.5.2 search engine has some bugs while searching "all html files". Until 0.5.3 is released I suggest using 0.5.1 instead.

All Sigil 0.5 releases
Old 02-24-2012, 09:06 AM   #9
theducks
Well trained by Cats
Posts: 29,801
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Timur View Post
Sigil 0.5.2 search engine has some bugs while searching "all html files". Until 0.5.3 is released I suggest using 0.5.1 instead.

All Sigil 0.5 releases
Seconded

If you need to add existing files, you need to use File: New and not the right-click menu, which crashes instantly.