regex puzzle: finding paragraph before...
Due to a badly formatted book, I was trying to construct a regex which would find any <p>...</p> section that occurs immediately before a <div, in order to then tweak that found chunk.
But I could not do it. A find expression like <p class "whatever">(.*)</p>?\s*<div is too greedy: it grabbed a whole load of paragraphs, i.e. from <p para 1... <p para 2.. ... <p para n.. <div.... The above regex grabs n paragraphs. Is there a way to grab only the nth one and replace its CSS class? PS: I am still using the regex in 0.4.2. Or could I use a p+div selector in CSS? |
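To make the greediness concrete, here is a minimal sketch in Python's `re` module, which is not the engine Sigil uses. The sample markup and the `class="whatever"` attribute are made up for illustration, and `re.DOTALL` is used on the assumption that the editor's `.` also matches line breaks.

```python
import re

# Hypothetical sample: three paragraphs followed by a div.
html = (
    '<p class="whatever">para 1</p>\n'
    '<p class="whatever">para 2</p>\n'
    '<p class="whatever">para n</p>\n'
    '<div class="box">content</div>\n'
)

# Greedy (.*) with DOTALL, so "." also crosses line breaks.
greedy = re.compile(r'<p class="whatever">(.*)</p>\s*<div', re.DOTALL)

match = greedy.search(html)

# Count how many closing paragraph tags ended up inside the single match.
paragraphs_swallowed = match.group(0).count('</p>')
print(paragraphs_swallowed)  # 3
```

The match begins at the first `<p` and runs to the last `</p>` that is followed by `<div`, so all three paragraphs land in one match.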
<p class="whatever">([^<]*?)</p>\s*<div
|
Quote:
e.g. some of the paragraphs have extra embedded styles like: <p class="calibre2">Without missing a beat, <em class="calibre4">High Wire</em> replies; “Without a job, I think I would head for the stars, to see what’s out there.”</p> |
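The limitation can be reproduced with a small sketch. It uses Python's `re` rather than Sigil's engine, and the two sample strings are shortened, hypothetical versions of the paragraph quoted above.

```python
import re

# A paragraph with plain text only, directly before a div.
plain = '<p class="calibre2">Plain text only.</p>\n<div>'

# A paragraph with an embedded <em> tag, directly before a div.
styled = ('<p class="calibre2">Without missing a beat, '
          '<em class="calibre4">High Wire</em> replies.</p>\n<div>')

# [^<] refuses to cross any "<", including the one that opens <em ...>.
no_tags = re.compile(r'<p class="calibre2">([^<]*?)</p>\s*<div')

plain_hit = no_tags.search(plain)
styled_hit = no_tags.search(styled)

print(plain_hit.group(1))   # Plain text only.
print(styled_hit)           # None
```

So `[^<]*?` is safe against runaway matches, but it skips every paragraph that contains inline markup.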
Quote:
Actually
Code:
(<p.*?</p>)(\s*?<div>)
I'm not sure if regex.dotall will work in 0.42; try adding (?s) to the search statement.
>>or could I use a .p+div class in CSS ?
If you really want to change any <div> which follows a </p>, why not? |
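One thing worth checking before relying on a lazy `.*?` together with `(?s)`: a regex still starts at the leftmost `<p`, and the lazy part keeps expanding until the rest of the pattern fits. The sketch below uses Python's `re` (not Sigil's engine) and made-up sample markup. The second pattern is an alternative technique, not something proposed in the thread: each step of the lazy loop is guarded by a negative lookahead so it cannot step onto another `<p` tag.

```python
import re

# Hypothetical sample: three paragraphs, then a bare <div>.
html = (
    '<p class="calibre2">para 1</p>\n'
    '<p class="calibre2">para 2 with <em class="calibre4">High Wire</em></p>\n'
    '<p class="calibre2">para n</p>\n'
    '<div>content</div>\n'
)

# Lazy version with (?s): still anchored at the first <p it finds.
lazy = re.compile(r'(?s)(<p.*?</p>)(\s*?<div>)')
lazy_hit = lazy.search(html)
lazy_count = lazy_hit.group(1).count('</p>')
print(lazy_count)  # 3

# Guarded version: (?!<p\b) stops the loop at the start of any further <p> tag,
# so a match can only contain a single paragraph.
guarded = re.compile(r'(?s)(<p\b(?:(?!<p\b).)*?</p>)(\s*?<div>)')
guarded_hit = guarded.search(html)
print(guarded_hit.group(1))  # <p class="calibre2">para n</p>
```

Whether the guarded form is available depends on lookahead support in the engine, so treat it as something to try rather than a guaranteed fix for 0.4.2.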
If your paragraphs are contained in single lines with newlines between them you can use your pattern with a slight modification:
Code:
<p class "whatever">([^\r\n]*)</p>\s*<div |
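A rough illustration of why the one-paragraph-per-line condition matters, written with Python's `re` rather than Sigil's engine. The sample strings are hypothetical, and the attribute is written as `class="whatever"` so that it matches real markup.

```python
import re

line_based = re.compile(r'<p class="whatever">([^\r\n]*)</p>\s*<div')

# Case 1: each paragraph sits on its own line.
one_per_line = (
    '<p class="whatever">para 1</p>\n'
    '<p class="whatever">para 2</p>\n'
    '<p class="whatever">para n</p>\n'
    '<div>'
)
hit_lines = line_based.search(one_per_line)
print(hit_lines.group(1))  # para n

# Case 2: two paragraphs share a line, so [^\r\n]* runs across both.
same_line = '<p class="whatever">a</p><p class="whatever">b</p>\n<div>'
hit_same = line_based.search(same_line)
print(hit_same.group(1))   # a</p><p class="whatever">b
```

In the first case the line break acts as a fence between paragraphs; in the second there is no fence, so the match grows again.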
It's pretty hard to fine-tune an expression's (non)greediness in 0.4.2 when the "Minimal Matching" check-box is the only method of control you have over it.
In 0.5.x and higher, I'd use something like:
Code:
<p(.*?)?>.*?</p>(?=(\s+)?<div) |
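The point of the lookahead is that the `<div` is checked but not consumed, so a replacement only touches the paragraph. Below is a sketch of the whole find-and-replace in Python's `re`, not in Sigil itself. The sample markup and the class name `before-div` are hypothetical, and the pattern is a variant of the one above rather than a copy: it adds a guard so the match cannot run across an earlier `<p` tag when `.` is allowed to match line breaks.

```python
import re

# Hypothetical sample: two paragraphs, then a div.
html = (
    '<p class="calibre2">para 1</p>\n'
    '<p class="calibre2">para n</p>\n'
    '<div class="box">content</div>\n'
)

# Group 1 captures the paragraph body.
# (?:(?!<p\b).)*? walks forward lazily but never onto another <p> tag.
# (?=\s*<div) requires a following <div without making it part of the match.
before_div = re.compile(
    r'<p class="calibre2">((?:(?!<p\b).)*?)</p>(?=\s*<div)',
    re.DOTALL,
)

# Swap the class on the matched paragraph only; "before-div" is a made-up name.
result = before_div.sub(r'<p class="before-div">\1</p>', html)
print(result)
```

Only the paragraph directly in front of the `<div` gets the new class; the earlier paragraph and the `<div` itself are left as they were.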
Thanks all, especially for showing how 0.5.2 is better than 0.4.2. Eventually I am going to have enough reason to upgrade.
I see that I'm going to have to add a couple of symbols to my limited regex repertoire! So far I have muddled through without ? or ^ |
Sigil 0.5.2 search engine has some bugs while searching "all html files". Until 0.5.3 is released I suggest using 0.5.1 instead.
All Sigil 0.5 releases |
Quote:
If you need to ADD Existing files, YOU need to use the File: New and not the Instant crash, right-click menu :thumbsup: |