12-16-2012, 10:13 AM   #2
theducks
Well trained by Cats
Quote:
Originally Posted by cybmole
I have not mastered how to do multi-line regex yet with the current engine.

I want to strip all but one layer from every blockquote nesting, e.g.
go from 2 or more levels deep like this
<blockquote class="a">
<blockquote class="a">
<blockquote class="a">
<p class="text">some text here</p>
</blockquote>
</blockquote>
</blockquote>

and reduce it to
<blockquote class="a">
<p class="text">some text here</p>
</blockquote>

I figure that the regex only needs to shrink 2 layers to 1; then, if it is run multiple times, it will progressively strip layers until only single layers remain.
Assume also that I can, by inspection, insert the appropriate class names into the regex.

I tried find
<blockquote class="calibre11">\s*<blockquote class="calibre11">(.*)</blockquote>\s*</blockquote>
with a suitable replace all, but it did not fully automate (it seemed to only do one instance, even when I said replace all); also I was not sure if I needed to tick any options, e.g. Dotall
I got in the habit of starting with: (?sm) (s = dot matches newlines, m = multiline)

(?sm)(<blockquote class="a">\s+){2,}(.+?)(</blockquote>\s+){2,}

replace is \1\2\3 <==== (\2 alone is the inner text with every blockquote layer removed)

The {2,} means I only touch nested BQs (two or more in a row); single ones are left alone.
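
For anyone who wants to try the idea outside the editor, here is a rough sketch in Python. It is only an illustration: the class name "a" and the sample markup come from the question, the function name collapse_nested is made up, and it assumes plain capturing groups with a \1\2\3 replacement so that one layer survives. It also assumes every tag is followed by whitespace and that the nested blockquotes open and close back to back.

```python
import re

# Assumption: class name "a" is from the question; substitute your own.
# (?s) lets . match newlines, (?m) is multiline mode.
NESTED_BQ = re.compile(
    r'(?sm)(<blockquote class="a">\s+){2,}(.+?)(</blockquote>\s+){2,}'
)


def collapse_nested(html: str) -> str:
    """Collapse runs of 2+ back-to-back nested blockquotes to one layer.

    Hypothetical helper for illustration only.
    """
    # A repeated group keeps only its last repetition, so \1 is a single
    # opening tag and \3 a single closing tag; \2 is the inner content.
    return NESTED_BQ.sub(r'\1\2\3', html)


sample = (
    '<blockquote class="a">\n'
    '<blockquote class="a">\n'
    '<blockquote class="a">\n'
    '<p class="text">some text here</p>\n'
    '</blockquote>\n'
    '</blockquote>\n'
    '</blockquote>\n'
)

result = collapse_nested(sample)
print(result)
```

Because {2,} is greedy, all the stacked layers go in one pass, so there should be no need to run it repeatedly on simple cases like the sample. A block that is already a single layer does not match and is left unchanged.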