Old 03-14-2016, 03:40 PM   #1
dicknskip
find: <i>(.*)</i>

I want to search for <i>stuff and things</i> in paragraphs that contain more than one occurrence, and find the occurrences one at a time. I am using <i>(.*)</i> as my search. I have read up on regex but can't work out an expression that finds each occurrence rather than spanning from the first <i> to the last </i> in the paragraph. The contents could be upper case, lower case, punctuation, numbers or white space, in any order, hence the .*. TIA.
Old 03-14-2016, 04:10 PM   #2
jackie_w
I think you need Regex's advanced features 'lookahead' and 'lookbehind' to do what you want. I'm by no means expert in this but I think it would be something like
Code:
(?<=<i>)(.*)(?=</i>)
Google should find you more detailed info.
Old 03-14-2016, 06:39 PM   #3
davidfor
I would have used:

Code:
<i>(.*?)</i>
The ? after the * makes the match non-greedy, so it stops at the first </i> it can reach instead of the last one in the paragraph.
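For anyone who wants to see the difference outside the editor, here is a minimal sketch using Python's standard `re` module. The sample paragraph is made up for illustration, and the assumption is that calibre's search behaves the same way as `re` for these two simple patterns.

```python
import re

# Hypothetical sample paragraph with two italic runs.
paragraph = "<p>Some <i>stuff</i> and <i>things</i> here.</p>"

# Greedy: .* runs to the end of the line, then backs up to the LAST </i>.
greedy = re.findall(r"<i>(.*)</i>", paragraph)

# Non-greedy: .*? grows one character at a time and stops at the FIRST </i>.
lazy = re.findall(r"<i>(.*?)</i>", paragraph)

print(greedy)  # ['stuff</i> and <i>things']
print(lazy)    # ['stuff', 'things']
```

The greedy pattern returns one long match that swallows the text between the two italic runs; the non-greedy one returns each run separately.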
Old 03-14-2016, 08:57 PM   #4
dicknskip
That worked perfectly. Thanks.
Old 03-15-2016, 12:53 AM   #5
eschwartz
Quote:
Originally Posted by jackie_w View Post
I think you need Regex's advanced features 'lookahead' and 'lookbehind' to do what you want. I'm by no means expert in this but I think it would be something like
Code:
(?<=<i>)(.*)(?=</i>)
Google should find you more detailed info.
Wrong use of lookaround that restricts you to just matching the text instead of the tags as well.
Useful feature, but not quite what the OP wanted.
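A rough illustration of that point, sketched with Python's `re` module on a made-up paragraph (the assumption being that the editor's regex engine treats these patterns the same way). Lookarounds check that the tags are there without making them part of the match, and the greedy .* inside still spans from the first tag to the last:

```python
import re

# Hypothetical sample paragraph.
paragraph = "Some <i>stuff</i> and <i>things</i> here."

# Greedy lookaround version as posted: still one long span, and no tags.
greedy_lookaround = re.findall(r"(?<=<i>)(.*)(?=</i>)", paragraph)

# Non-greedy lookaround: each run separately, but the tags are not matched.
text_only = [m.group(0) for m in re.finditer(r"(?<=<i>)(.*?)(?=</i>)", paragraph)]

# Plain non-greedy pattern: the whole match includes the tags.
with_tags = [m.group(0) for m in re.finditer(r"<i>(.*?)</i>", paragraph)]

print(greedy_lookaround)  # ['stuff</i> and <i>things']
print(text_only)          # ['stuff', 'things']
print(with_tags)          # ['<i>stuff</i>', '<i>things</i>']
```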

...


davidfor's solution works, kind of... but the proper way to do it looks like this:

Code:
<i>((?:(?!</?i>).)*)</i>
This (the inner part, (?:(?!</?i>).)* ) uses the power of negative lookaheads to ensure that the dot-match-everything can only consume a character when an "i" opening/closing tag does NOT start at that position.

...

The difference between mine and davidfor's solutions is that, given the sample text:

Code:
<i>sample <i>text</i></i>
Mine will match the single internal unit "<i>text</i>" whereas davidfor's will erroneously match from the first opening "i" tag until the first closing "i" tag, i.e. "<i>sample <i>text</i>".
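The same comparison can be reproduced with Python's `re` module. This is only a sketch, assuming the editor's regex engine handles both patterns the way `re` does:

```python
import re

sample = "<i>sample <i>text</i></i>"

# Non-greedy: starts at the first <i> and stops at the first </i>.
lazy = re.search(r"<i>(.*?)</i>", sample).group(0)

# Lookahead-guarded dot: the contents may not contain the start of any
# <i> or </i> tag, so only the innermost pair can match.
tempered = re.search(r"<i>((?:(?!</?i>).)*)</i>", sample).group(0)

print(lazy)      # <i>sample <i>text</i>
print(tempered)  # <i>text</i>
```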

This would matter a lot more if we were talking about a tag which is meant to be nested.
Like spans.

Which is why it is a very useful principle to know, although in this case you really can just make do with the easy solution.





credits: I originally learned this trick here: https://stackoverflow.com/questions/.../406408#406408

Last edited by eschwartz; 03-15-2016 at 12:55 AM.
Old 03-15-2016, 09:37 AM   #6
jackie_w
Quote:
Originally Posted by eschwartz View Post
... that restricts you to just matching the text instead of the tags as well.
Unfortunately that was actually the question I thought was being asked. Now I've re-read the OP, I see I was mistaken.

It would appear my Regex skills exceed my skills in basic reading/comprehension in my native language. I wonder how common that is?

Sorry for any confusion, dicknskip.
Old 03-15-2016, 09:57 AM   #7
eschwartz
Quote:
Originally Posted by jackie_w View Post
It would appear my Regex skills exceed my skills in basic reading/comprehension in my native language. I wonder how common that is?
Happens to us all.

(Except me, obviously. I deny everything.)
Old 03-15-2016, 10:24 AM   #8
theducks
Quote:
Originally Posted by jackie_w View Post
Unfortunately that was actually the question I thought was being asked. Now I've re-read the OP, I see I was mistaken.

It would appear my Regex skills exceed my skills in basic reading/comprehension in my native language. I wonder how common that is?

Sorry for any confusion, dicknskip.
I think this line currently wraps around a New York City block (they tend to be BIG).

You are behind me, somewhere, way back there... (And I can't see the front-o-the-line)
Old 03-15-2016, 10:50 AM   #9
jackie_w
Quote:
Originally Posted by theducks View Post
I think this line currently wraps around a New York City block (they tend to be BIG).

You are behind me, somewhere, way back there... (And I can't see the front-o-the-line)
That many! Can you see any of the world's great thinkers, movers and shakers in that line? If so we're all DOOMED!
Old 03-17-2016, 11:47 PM   #10
Barook
I do a first pass like so:
Code:
   text: <span class="italic">something</span>
replace: <span class="italic">([^<]+)</span>
   with: <i>\1</i>
 result: <i>something</i>
It doesn't work when there are other tags inside the <span> tags...
Code:
<span class="italic"><span class="bold">something</span></span>
... but non-greedy replaces usually clean those up, as long as the innermost span is converted first (a non-greedy match on the outer span stops at the first </span>, which belongs to the inner span):
Code:
   text: <span class="italic"><span class="bold">something</span></span>
replace: <span class="bold">(.*?)</span>
   with: <b>\1</b>
 result: <span class="italic"><b>something</b></span>
replace: <span class="italic">(.*?)</span>
   with: <i>\1</i>
 result: <i><b>something</b></i>
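The order of the two replaces matters for nested spans. Here is a small sketch with Python's `re.sub`, assuming it behaves like the editor's replace for these patterns, that compares converting the outer span first with converting the inner span first:

```python
import re

text = '<span class="italic"><span class="bold">something</span></span>'

# Outer span first: .*? stops at the first </span>, which closes the
# INNER span, so the tags end up crossed.
outer_first = re.sub(r'<span class="italic">(.*?)</span>', r"<i>\1</i>", text)

# Inner span first, then the outer one: the output stays well nested.
step1 = re.sub(r'<span class="bold">(.*?)</span>', r"<b>\1</b>", text)
inner_first = re.sub(r'<span class="italic">(.*?)</span>', r"<i>\1</i>", step1)

print(outer_first)  # <i><span class="bold">something</i></span>
print(step1)        # <span class="italic"><b>something</b></span>
print(inner_first)  # <i><b>something</b></i>
```

Working from the innermost tag outwards keeps each non-greedy match paired with its own closing tag.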
A similar pattern is useful to isolate stuff in quotes:
Code:
   text: <a id="return_from_note_1"></a><a href="notes.html#note_1">see note 1</a>
replace: <a id="([^"]+)"></a><a href="([^"]+)">(.*?)</a>
   with: <a id="\1" href="\2">\3</a>
 result: <a id="return_from_note_1" href="notes.html#note_1">see note 1</a>
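The anchor merge can be tried out the same way. A minimal sketch with Python's `re.sub`, using the thread's own sample text and assuming the editor's replace treats \1, \2 and \3 as `re` does:

```python
import re

text = '<a id="return_from_note_1"></a><a href="notes.html#note_1">see note 1</a>'

# [^"]+ grabs everything up to the next double quote, so each attribute
# value is captured without running into the following attribute.
pattern = r'<a id="([^"]+)"></a><a href="([^"]+)">(.*?)</a>'
merged = re.sub(pattern, r'<a id="\1" href="\2">\3</a>', text)

print(merged)
# <a id="return_from_note_1" href="notes.html#note_1">see note 1</a>
```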
... or for removing all attributes from uselessly-extravagant tags:
Code:
   text: <i class="italic" style="margin:auto;padding:auto;font-size=1em;">something</i>
replace: <i[^>]+>(.*?)</i>
   with: <i>\1</i>
 result: <i>something</i>
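Note that a \1 in the replacement only works if the search pattern has a capturing group for it to refer to. A short sketch with Python's `re` module (an assumption: the editor may report a missing group differently from `re`):

```python
import re

text = '<i class="italic" style="margin:auto;padding:auto;font-size=1em;">something</i>'

# With a capturing group, \1 carries the tag contents into the replacement.
cleaned = re.sub(r"<i[^>]+>(.*?)</i>", r"<i>\1</i>", text)
print(cleaned)  # <i>something</i>

# Without any group in the pattern, Python's re rejects the \1 reference.
try:
    re.sub(r"<i[^>]+>something</i>", r"<i>\1</i>", text)
    group_error = False
except re.error:
    group_error = True
print(group_error)  # True
```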
Old 03-18-2016, 12:07 AM   #11
davidfor
I haven't hit the anchor one, but for the rest, I usually use Diap's Editing Toolbag. Much easier than trying to wrap my brain around the regex.

For the last one, why do you care about matching the close tags? Just fixing the opening should be enough. So:

Code:
   text: <i class="italic" style="margin:auto;padding:auto;font-size=1em;">
replace: <i[^>]+>
   with: <i>
 result: <i>
should work the same. Am I missing something?
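One thing worth checking before running an opening-tag-only replace over a whole book: <i[^>]+> also matches any other tag whose name starts with "i", such as <img ...>. The sketch below uses Python's `re.sub` on a made-up line (the <img> tag and file name are hypothetical) and shows a slightly stricter variant, <i\s[^>]*>, which is a suggestion here and not something from the posts above:

```python
import re

# Hypothetical line containing both an italic tag and an image tag.
html = '<i class="italic" style="margin:auto;">something</i> <img src="cover.jpg"/>'

# Loose pattern: "mg src=..." also satisfies [^>]+, so <img> is clobbered.
loose = re.sub(r"<i[^>]+>", "<i>", html)

# Stricter pattern: require whitespace right after "<i" so only <i ...> matches.
strict = re.sub(r"<i\s[^>]*>", "<i>", html)

print(loose)   # <i>something</i> <i>
print(strict)  # <i>something</i> <img src="cover.jpg"/>
```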