#1 |
Addict
Posts: 358
Karma: 65460
Join Date: Jun 2011
Device: Kindle
|
restricting regex to single lines of code?
Not too long ago I asked a question about how to do the opposite of this. Naturally, now I need to know how to do the opposite of that...
I'm currently working on a document in which all double quotation marks have been replaced with a question mark. Obviously, I'm trying to undo that. The way I had planned on doing that is by searching for strings that begin with ? and end with two consecutive punctuation marks, the latter of which is also a ? (i.e., .? or !? or ,? or ??). The search I was using was:
Code:
\?([^\.]*)\.\?
Example:
Code:
<p>?Or the television reports??</p>
<p>?No.?</p>
How do I do this right? (PS - using "<" as a marker won't work because not all dialogue finishes at the end of a paragraph, e.g. <p>"Let's get out of here!" he yelled.</p>)
Last edited by ElMiko; 01-28-2012 at 12:18 AM.
#2 | |
Well trained by Cats
Posts: 30,891
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
(\!\?|\?\?|\.\?)
Just add all the escaped combinations you are looking for, separated by a pipe.
|
#3 |
Addict
|
Yeah, I mean, that's how my search is set up right now (as per the example), but as I say, it's being too greedy in what it matches.
|
#4 | |
Well trained by Cats
|
Quote:
Is it possible you are trying to do this in just one pass? I would try for little bites.
|
#5 | |
Addict
|
Quote:
So what you're suggesting, in order to find ?No.? in my initial example, is:
Code:
\?([^\.]*)*?\.\?
|
#6 |
Well trained by Cats
|
For trailing quotes, with the punctuation inside the quotes:
Search
Code:
(\!|\.|\,|\?)\?+?
Replace
Code:
\1"
For leading quotes:
Search
Code:
\?([A-Za-z])+?
Replace
Code:
"\1
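For anyone who wants to try the "little bites" idea outside Sigil, here is a minimal Python sketch of the two passes. It uses Python's `re` module rather than Sigil's engine, the patterns are simplified forms of the ones above, and the function name `restore_quotes` is made up for illustration.

```python
import re


def restore_quotes(text: str) -> str:
    """Turn stand-in ? characters back into straight double quotes, in two passes."""
    # Pass 1: a ? that directly follows closing punctuation becomes a closing quote.
    text = re.sub(r'([!.,?])\?', r'\1"', text)
    # Pass 2: a ? that directly precedes a letter becomes an opening quote.
    text = re.sub(r'\?([A-Za-z])', r'"\1', text)
    return text


if __name__ == "__main__":
    print(restore_quotes("<p>?Or the television reports??</p>"))
    print(restore_quotes("<p>?No.?</p>"))
```

Running the closing pass first matters: it consumes the second ? of a pair like ?? before the opening pass looks at what is left. Ambiguous runs (three or more ? in a row, or dialogue that opens with a digit) would still need checking by hand.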
#7 |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
My guess would be something like this, tho I'm 90% asleep and off to bed... haven't tested it, but should work in most cases.
Code:
\?(\w[^?]+[[:punct:]])\?
Replace:
“\1”
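A rough way to test this pattern outside Sigil is the Python sketch below. Python's `re` module has no POSIX classes such as `[[:punct:]]`, so the class is built by hand from `string.punctuation` (ASCII only); that substitution and the names `PUNCT`, `QUOTED` and `curl_quotes` are assumptions for illustration, not part of the original suggestion.

```python
import re
import string

# Python's re has no [[:punct:]], so spell the class out from ASCII punctuation.
PUNCT = "[" + re.escape(string.punctuation) + "]"

# Same shape as the pattern above: ?, a word character, non-? text,
# one punctuation mark, then the closing ?.
QUOTED = re.compile(r"\?(\w[^?]+" + PUNCT + r")\?")


def curl_quotes(text: str) -> str:
    """Wrap each ?...? span that ends in punctuation in curly quotes."""
    return QUOTED.sub("“\\1”", text)


if __name__ == "__main__":
    print(curl_quotes("<p>?No.?</p>"))
    print(curl_quotes("<p>?Let's get out of here!? he yelled.</p>"))
```

Note that `[^?]+` still matches line breaks, so a span with no closing ? on its own line can run on into the next one.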
#8 |
Addict
|
Thank you both!
How do I list all results?
#9 |
Evangelist
|
You can't at the moment within Sigil. I use my shell's regex module to check inside the .epub. I'm not too sure what the easiest method would be in Windows.
|
#10 |
Member
Posts: 19
Karma: 10
Join Date: Apr 2011
Device: kindle dx
|
I don't think Sigil can do that. You can do this on a Linux system, as Serpentine said.
Last edited by congngo; 01-27-2012 at 02:58 PM. Reason: newer |
#11 |
Addict
|
Ah, I see. Thanks for the heads-up.
Part of my confusion is that I found myself in a position earlier where I had to start the search string with (?s) in order to make it match more than one line of code. But now, when I want to restrict the search to a single line of code, it automatically includes multiple lines! What gives? Am I just misunderstanding the mechanics of regex searches?
#12 |
Member
|
Because version 0.4.2 uses QRegExp (a regular expression engine) and version 0.5.0 uses PCRE. user_none explained this earlier. PCRE is better but has a different syntax.
|
#13 |
Evangelist
|
(?s) is 'single line' mode: it evaluates everything as a single line, so a . is not restricted to one line and will wrap onto the next. However, if you are explicitly looking for \s, that will also wrap around even without single-line mode, because it matches the line breaks [\n\r].
In multiline mode (?m), the anchors ^ and $ match the start and end of each line, rather than of the whole string. As always, check out the PCRE documentation at http://www.pcre.org/pcre.txt and search around for a good starting point; it's surprisingly easy to read.
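To see the difference in practice, here is a small Python sketch. Python's `re` is not PCRE, but `.`, `(?s)`, `\s` and negated classes behave the same way for this purpose; the sample string and variable names are made up, and plain \n line breaks are assumed.

```python
import re

text = "first line\nsecond line"

# By default, . stops at a line break, so each line is a separate match.
default_dot = re.findall(r".+", text)

# With (?s), . also matches the line break, so the match wraps onto the next line.
dotall = re.findall(r"(?s).+", text)

# A negated class matches anything not listed in it, line breaks included...
negated = re.findall(r"[^.]+", text)

# ...unless the line break is listed in the class as well.
negated_one_line = re.findall(r"[^.\n]+", text)

# \s matches the line break with or without (?s).
whitespace = re.findall(r"\s", text)

if __name__ == "__main__":
    for name, result in [
        ("default_dot", default_dot),
        ("dotall", dotall),
        ("negated", negated),
        ("negated_one_line", negated_one_line),
        ("whitespace", whitespace),
    ]:
        print(name, result)
```

The practical upshot is that a class like `[^\.]` needs an explicit \n added to it if a match is supposed to stay on one line.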
#14 | |
Addict
|
Quote:
EDIT: WAAAAIT a minute. So are you saying that .* doesn't include \s, \n, or \r, but [^\.] (for example) does? In other words, searches that use . match everything except \n, \s, and \r, whereas searches that use ^ match every value (including \n, \r, and \s) except the value that follows it? I think (I hope) this is becoming slightly clearer. So then, is there an expression that would search for the kind of string I'm looking for now but restrict the search to a single line of code? I.e., maybe something that uses ^ to negate \n values?
Last edited by ElMiko; 01-28-2012 at 12:38 AM.
|
#15 |
Evangelist
|
Let us use the following test text:
Code:
This is an example paragraph of text
it is not very long, nor very correct.

But it should be enough.
Consider the expression: .+
You will have three matches:
Code:
1(This is an example paragraph of text)
2(it is not very long, nor very correct.)
3(But it should be enough.)
You will have two matches:
Code:
1(This is an example paragraph of text
it is not very long, nor very correct.)
2(But it should be enough.)
If we consider: (?s).*
You will have one match:
Code:
1(This is an example paragraph of text
it is not very long, nor very correct.

But it should be enough.)
This happens because, by default, the searched string is treated as a single long line. This means that it is effectively seen as:
Code:
^This is an example paragraph of text\r\nit is not very long, nor very correct\.\r\n\r\nBut it should be enough\.$
\s is always going to match those \r and \n. So you need to be pretty careful with \s either way, whether the dot matches line breaks or not. This is why there is multiline matching: the anchors in the above text are moved back to their logical positions. Rather than sitting at the start and end of the whole string, they now match at the start and end of each line, making it look more like:
Code:
^This is an example paragraph of text$\r\n^it is not very long, nor very correct\.$\r\n\r\n^But it should be enough\.$
This lets us evaluate lines more accurately. For example, let us match a line and the following line, which starts with "it is not":
(?m)^(.+)\s+^(it is not.+)$
Code:
1/1(This is an example paragraph of text)
1/2(it is not very long, nor very correct.)

But it should be enough.
True to the line restriction, there would not be a match if it were searched for in:
Code:
This is an example paragraph of text it is not very long, nor very correct. But it should be enough.
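The multiline example, and one possible answer to the original question, can be tried with the Python sketch below. It assumes plain \n line breaks and Python's `re` module rather than Sigil's PCRE, and the last pattern (a negated class that also excludes \n) is a hypothetical suggestion that nobody in this thread actually posted.

```python
import re

# The test text from the post above, assuming plain \n line breaks.
text = (
    "This is an example paragraph of text\n"
    "it is not very long, nor very correct.\n"
    "\n"
    "But it should be enough."
)

# (?m) lets ^ and $ match at every line boundary, not only at the ends of the string.
two_lines = re.search(r"(?m)^(.+)\s+^(it is not.+)$", text)

# The sample from the first post, with each paragraph on its own line.
sample = "<p>?Or the television reports??</p>\n<p>?No.?</p>"

# The original search: [^\.] also matches \n, so the match runs on into the next line.
runaway = re.search(r"\?([^\.]*)\.\?", sample)

# Hypothetical fix: exclude both ? and \n, so a match cannot leave its line.
one_line = re.findall(r"\?([^?\n]*[.!,?])\?", sample)

if __name__ == "__main__":
    print(two_lines.groups())
    print(repr(runaway.group(0)))
    print(one_line)
```

The same idea should carry over to a Sigil search by adding \n (and \r, if the file uses it) to the negated class.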