01-24-2012, 03:50 PM | #1 |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Yet another regex question
I want to find all instances of <p> followed by a lower-case character, testing just the first character after the tag.
Thanks - John |
01-24-2012, 04:02 PM | #2 |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
01-24-2012, 05:27 PM | #3 |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
Might be a bit overkill:
If you want to find paragraphs that might be incorrectly split, here's what I've come up with. It sometimes needs a little tweaking, but it is generally rather good. I wouldn't recommend replacing everything unless you grep for results first (I think I have an alternative somewhere that ignores span and [bsiu] tags... mmm). Code:
(?smi)(?<=[^[:punct:]])</p>\s*<p[^<>]*>(?=[\.-?])|</p>\s*<p[^<>]*>(?!\s*(<[sbui]>|[[:punct:]\s])+[[:upper:]])(?=[[:punct:]\s]+[[:lower:]])|</p>\s*<p[^<>]*>((?=[ \.>]{2,}([[:punct:]]|[[:lower:]]))|(?=[[:lower:]]))|(?<=,)</p>\s*<p[^<>]*> |
01-24-2012, 05:36 PM | #4 |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Thanks ducks,
I still don't know what I did, but I ended up with a space in the middle of a sentence being replaced by </p><p> in a couple of dozen places in my document. Anyway... this is what did it. Code:
</p>\s+<p>([a-z])
Regards - John |
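For anyone who wants to try that pattern outside the editor, here is a minimal sketch using Python's `re` module. Python is an assumption (the thread doesn't name the tool), and the sample HTML is made up for illustration.

```python
import re

# Hypothetical sample: the second paragraph starts with a lower-case letter,
# which suggests a sentence was split in the wrong place.
html = "<p>The cat sat on</p>\n<p>the mat.</p>\n<p>Then it slept.</p>"

# The pattern from the post: </p>, some whitespace, <p>, then one lower-case letter.
pattern = re.compile(r"</p>\s+<p>([a-z])")

# Collect the captured first letters of the suspicious paragraphs.
matches = [m.group(1) for m in pattern.finditer(html)]
print(matches)

# Rejoin the sentence: replace the break with a space and put the letter back.
fixed = pattern.sub(r" \1", html)
print(fixed)
```

Because the letter is inside the match, the replacement has to restore it with `\1`. Note also that `\s+` requires at least one whitespace character between the tags, so `</p><p>` with nothing in between would be missed.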
01-24-2012, 05:45 PM | #5 |
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
|
Code:
(?s)</p>\s*<p\b[^<>]*>(?=[[:lower:]])
Last edited by Serpentine; 01-24-2012 at 05:48 PM. |
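A rough sketch of how the lookahead version behaves, again assuming Python's `re` module. Python's `re` does not support POSIX classes such as `[[:lower:]]`, so `[a-z]` stands in for it here (ASCII letters only). The sample HTML and the `calibre1` class name are invented for the example.

```python
import re

# [a-z] is swapped in for [[:lower:]], which Python's re cannot parse as a POSIX class.
pattern = re.compile(r"(?s)</p>\s*<p\b[^<>]*>(?=[a-z])")

# Hypothetical sample: the second <p> carries an attribute and starts in lower case.
html = '<p>The cat sat on</p>\n<p class="calibre1">the mat.</p><p>Next one.</p>'

# The lookahead checks the letter but leaves it out of the match itself.
m = pattern.search(html)
print(repr(m.group(0)))

# So a plain space is enough as the replacement; no capture group is needed.
merged = pattern.sub(" ", html)
print(merged)
```

Compared with the simpler pattern, `\s*` also catches tags with no whitespace between them, and `<p\b[^<>]*>` accepts a `<p>` tag that has attributes.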
01-24-2012, 06:24 PM | #6 | |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
[a-zI] says a through z, or I. It is all in the hyphen. |
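To see what the hyphen changes, here is a small sketch in Python's `re` (an assumed tool; the test characters are arbitrary). With the hyphen, `a-z` is a range; without it, the class lists just the three literal characters.

```python
import re

with_hyphen = re.compile(r"[a-zI]")    # any of a through z, or capital I
without_hyphen = re.compile(r"[azI]")  # only a, z, or capital I

# Test one character at a time against both classes.
results = {
    ch: (bool(with_hyphen.fullmatch(ch)), bool(without_hyphen.fullmatch(ch)))
    for ch in ["m", "I", "J"]
}
for ch, (ranged, literal) in results.items():
    print(ch, ranged, literal)
```

Here `m` matches only the ranged class, `I` matches both, and `J` matches neither.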
|
01-25-2012, 02:39 AM | #7 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
|
|
01-30-2012, 09:11 AM | #8 | |
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
|
Quote:
It works great! |
|
01-30-2012, 08:41 PM | #9 | |
Zealot
Posts: 119
Karma: 64428
Join Date: Aug 2011
Device: none
|
Quote:
Code:
<p>[a-z]
Code:
[a-z]</p> |
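The two short patterns above can be tried side by side. This is only a sketch under the same assumptions as before (Python's `re`, invented sample text): the first finds paragraphs that open with a lower-case letter, the second finds paragraphs that end on a letter with no closing punctuation.

```python
import re

# Hypothetical sample with one sentence split across two paragraphs.
html = "<p>It was late and</p>\n<p>the rain kept falling.</p>\n<p>Morning came.</p>"

# Paragraphs that begin with a lower-case letter.
starts_lower = re.findall(r"<p>[a-z]", html)

# Paragraphs that end with a lower-case letter instead of punctuation.
ends_lower = re.findall(r"[a-z]</p>", html)

print(starts_lower)
print(ends_lower)
```

Both patterns assume a bare `<p>` tag with no attributes, and each includes the letter in the match, so a replacement would need to put it back.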
|
|