Old 04-10-2012, 04:52 PM   #1
ElMiko
Trying to limit a search to a single line...

I'm trying to catch strings that look like this:

Code:
<p class="calibre1">Don’t be late for school,” she called. [...]
(Note the missing opening quotation mark)

The regex search that I'm using
Code:
>([^“]*)”
returns a match that spans multiple lines. How do I limit the search to a single line?
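A minimal sketch of why the match runs across lines, using Python's re module as a stand-in for Sigil's regex engine (an assumption; the two should agree on a pattern this simple). The sample markup is hypothetical, modelled on the snippet above. The negated class [^“] excludes only the opening quotation mark, so it accepts newline characters like any other character.

```python
import re

# Hypothetical sample: the second paragraph is missing its opening quotation mark.
html = (
    '<p class="calibre1">“I have to go pick up Cindy.”</p>\n'
    '\n'
    '<p class="calibre1">Don’t be late for school,” she called.</p>'
)

# The pattern from the post above.
pattern = re.compile(r'>([^“]*)”')

match = pattern.search(html)

# [^“] happily consumes '\n', so the match starts at the '>' that closes
# </p> on the first line and runs into the second paragraph.
spans_lines = '\n' in match.group(0)

print(repr(match.group(0)))
print(spans_lines)
```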
Old 04-10-2012, 05:24 PM   #2
Perkin
Quote:
Originally Posted by ElMiko View Post
the regex search that I'm using
Code:
>([^“]*)”
I haven't tested this, but try adding a question mark after the asterisk to make the match non-greedy.

Code:
>([^“]*?)”
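A quick way to check what the non-greedy quantifier changes, sketched in Python's re module (an assumption; Sigil's own engine may differ in details). On a hypothetical two-paragraph sample the greedy and lazy patterns return the same cross-line match, because laziness only picks the nearest closing quote and does not stop the class from crossing newlines. It does make a difference when one line holds two closing quotes.

```python
import re

# Hypothetical sample modelled on the thread's snippet.
html = (
    '<p class="calibre1">“I have to go pick up Cindy.”</p>\n'
    '\n'
    '<p class="calibre1">Don’t be late for school,” she called.</p>'
)

greedy = re.search(r'>([^“]*)”', html).group(0)
lazy = re.search(r'>([^“]*?)”', html).group(0)

# Both start at the '>' of </p> and cross the blank line.
same_match = greedy == lazy

# Where laziness does matter: two closing quotes on one line.
line = '<p>Go,” she said. Now,” he added.</p>'
greedy_line = re.search(r'>([^“]*)”', line).group(0)
lazy_line = re.search(r'>([^“]*?)”', line).group(0)

print(same_match)
print(greedy_line)
print(lazy_line)
```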
Old 04-10-2012, 06:04 PM   #3
DiapDealer
It depends on how you define a "line." If no line-break character is ever encountered, the regex Find doesn't really care if the code and/or text wraps around the screen several times... it still considers it all one "line."
Old 04-10-2012, 06:12 PM   #4
ElMiko
I mean a line of code, not a line of text.

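What's happening is that my search matches across both lines of the following code, starting at the > that closes </p> on the first line and running through the ” after "school,":

Code:
<p class="calibre1">“I have to go pick up Cindy.”</p>

<p class="calibre1">Well, don’t be late for school,” she called. [...]
when all I want it to match is the second line alone, from the > after "calibre1" through that same ”: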
Code:
<p class="calibre1">“I have to go pick up Cindy.”</p>

<p class="calibre1">Well, don’t be late for school,” she called. [...]

Last edited by ElMiko; 04-10-2012 at 06:16 PM.
Old 04-10-2012, 07:09 PM   #5
Perkin
Try the following, works for that small sample.

Code:
(?<=[^p]>)([^“]*?)”
Old 04-10-2012, 07:26 PM   #6
ElMiko
Quote:
Originally Posted by Perkin View Post
Try the following, works for that small sample.

Code:
(?<=[^p]>)([^“]*?)”
Indeed it does! Thanks, Perkin. If I could impose on your expertise just a little longer, though, what does the first parenthetical expression (?<=[^p]>) mean?

I tried referring to the reg-ex cheatsheet that (I believe) theducks recommended several months ago, but I couldn't really make heads or tails of it based on the descriptions.


EDIT:
Aww heck, I spoke too soon. If the preceding line of code doesn't contain dialogue, the expression captures multiple lines...

I still would love an explanation for why the expression you provided DID work in that case. I might just be able to figure out how to extrapolate a comprehensive solution from that without bothering you guys any further!

Last edited by ElMiko; 04-10-2012 at 07:31 PM.
Old 04-10-2012, 07:49 PM   #7
Perkin
(?<=[^p]>)

The (?<= opens a look-behind: it requires the next bit to appear immediately before the match, but doesn't include it in the match. The [^p]> looks for a '>' that isn't preceded by a 'p', so it doesn't match on the closing paragraph tags (</p>), and the closing ) ends that group. Then your actual search takes place.

Hope that helps you work it out.

Edit: If you give a larger sample and what it's messing up on and what you want matched, I'll have a look again.

Last edited by Perkin; 04-10-2012 at 07:58 PM.
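The look-behind explanation can be tried out with a short sketch in Python's re module (an assumption standing in for Sigil's engine; both support fixed-width look-behind). Both samples are hypothetical. The first mirrors the case where the pattern works, the second the case reported in the EDIT above, where the preceding paragraph has no dialogue.

```python
import re

# Perkin's pattern: the match may only begin right after a '>' that is
# not preceded by 'p', which rules out the '>' of a closing </p> tag.
pattern = re.compile(r'(?<=[^p]>)([^“]*?)”')

# Preceding paragraph contains dialogue (opens with “).
dialogue_before = (
    '<p class="calibre1">“I have to go pick up Cindy.”</p>\n'
    '\n'
    '<p class="calibre1">Well, don’t be late for school,” she called.</p>'
)

# Preceding paragraph is plain narration: nothing blocks [^“] there.
narration_before = (
    '<p class="calibre1">She picked up her keys.</p>\n'
    '\n'
    '<p class="calibre1">Well, don’t be late for school,” she called.</p>'
)

first = pattern.search(dialogue_before).group(0)
second = pattern.search(narration_before).group(0)

print(repr(first))   # stays on one line
print(repr(second))  # starts in the narration paragraph and crosses lines
```

In the second sample the match begins right after the first opening tag, and since the class still accepts newlines it carries on to the first ” it finds.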
Old 04-11-2012, 05:56 PM   #8
Timur
Adding newlines in the character class will prohibit multiline matches in this case:

Code:
>([^“\r\n]*)”
If you prefer it that way, you can augment the pattern with a look-behind assertion to get rid of the ">" character from the matched string:

Code:
(?<=>)([^“\r\n]*)”
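A small check of the newline-excluding pattern, again sketched in Python's re module as a stand-in for Sigil's engine (an assumption). The three-paragraph sample is hypothetical and includes a narration paragraph, the case that tripped up the earlier look-behind pattern.

```python
import re

# Timur's second pattern: \r and \n are excluded from the class, so the
# match can never leave the line it started on.
pattern = re.compile(r'(?<=>)([^“\r\n]*)”')

html = (
    '<p class="calibre1">She picked up her keys.</p>\n'
    '\n'
    '<p class="calibre1">“I have to go pick up Cindy.”</p>\n'
    '\n'
    '<p class="calibre1">Well, don’t be late for school,” she called.</p>'
)

# Collect every match in the sample.
hits = [m.group(0) for m in pattern.finditer(html)]

print(hits)
```

Only the paragraph with the missing opening quotation mark is reported. The correctly quoted paragraph is skipped because “ is still excluded from the class.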
Old 04-11-2012, 06:18 PM   #9
ElMiko
Quote:
Originally Posted by Timur View Post
Adding newlines in the character class will prohibit multiline matches in this case:

Code:
>([^“\r\n]*)”
If you prefer it that way, you can augment the pattern with a look-behind assertion to get rid of the ">" character from the matched string:

Code:
(?<=>)([^“\r\n]*)”
Oof... feeling really silly. I didn't know you could exclude multiple characters, much less "\X" expressions...

Thanks to both of you for gently continuing my reg-ex education!
Old 04-11-2012, 06:34 PM   #10
Perkin
A pretty good site for learning regex stuff is here
Old 04-12-2012, 03:39 PM   #11
ElMiko
Quote:
Originally Posted by Perkin View Post
A pretty good site for learning regex stuff is here
Thanks, Perkin. As it happens, I already have that one bookmarked and have referenced it a few times in the past. The problem I have with all the reg-ex tutorials is that I never understand what they're talking about until after I've figured out how to do it. Either I interpret their language differently from what it's supposed to mean, or I simply think it's gibberish. For what it's worth, the site you linked to has unquestionably produced the best results for me... but it's still hit-or-miss.