#1 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
another regex puzzle - detect capitalised phrases
I suspect there is no easy answer to this but I will ask anyway.
Given a book which uses capitalisation in lieu of scene breaks, with all paragraphs sharing the same CSS, i.e.

THIS IS HOW THE 1st paragraph starts... blah blah
but not the next paragraph...
Or the one after that...
YET SOME TIME LATER THERE is another instance...

I want to pick out those capitalised starts in order to assign a unique CSS class, but devising a rule is very hard. Testing that the 2nd letter of a paragraph is capitalised works most times, but it will miss "I CANNOT GET THIS one..." and "A TOUGH ACT TO follow", and it will mis-classify "I don't want this one".

Any better methods, anyone?
#2 |
♫
Posts: 661
Karma: 506380
Join Date: Aug 2010
Location: Germany
Device: Kobo Aura / PB Lux 2 / Bookeen Frontlight / Kobo Mini / Nook Color
|
<p>[^a-z]{4,}
This will find any paragraph whose first 4 characters contain no lowercase letters. Of course, it will still match <p>U.S.A. is the country... so you need to check whether 4 is enough.
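One way to check whether 4 is enough is to run the pattern over a handful of paragraph openings before using it in a search-and-replace. The following is only a minimal sketch in Python, assuming every paragraph starts with a bare `<p>` tag (no attributes); the sample strings and the helper name `is_caps_start` are made up for illustration.

```python
import re

# Pattern from the post above: <p> followed by 4 or more characters
# that are not lowercase ASCII letters.
PATTERN = re.compile(r"<p>[^a-z]{4,}")

# Hypothetical sample paragraphs, not taken from a real book.
samples = [
    "<p>THIS IS HOW THE first paragraph starts",
    "<p>I CANNOT GET THIS one",
    "<p>A TOUGH ACT TO follow",
    "<p>I don't want this one",
    "<p>U.S.A. is the country",
]


def is_caps_start(paragraph, pattern=PATTERN):
    """Return True if the opening of the paragraph matches the pattern."""
    return pattern.match(paragraph) is not None


results = {text: is_caps_start(text) for text in samples}
for text, hit in results.items():
    print(hit, text)
```

With these samples the three capitalised openings match, "I don't want this one" does not, and the "U.S.A." paragraph is the false positive mentioned above.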
#3 |
Berti
Posts: 1,197
Karma: 4985964
Join Date: Jan 2012
Location: Zischebattem
Device: Acer Lumiread
|
Code:
<p>([A-Z])([A-Z|\s[A-Z])
Actually
Code:
<p>([A-Z])([\sA-Z])
Last edited by mmat1; 02-23-2012 at 03:13 PM.
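To see what the second pattern accepts, it can be tried on the openings from the first post. This is a rough sketch in Python, again assuming bare `<p>` tags; the sample list and the helper name `starts_capitalised` are illustrative only. Because the second character may be whitespace, the pattern catches "I CANNOT..." and "A TOUGH...", but an ordinary sentence beginning with "I " or "A " also matches.

```python
import re

# Second pattern from the post above: an uppercase letter followed by
# either whitespace or another uppercase letter.
PATTERN = re.compile(r"<p>([A-Z])([\sA-Z])")

# Hypothetical paragraph openings based on the examples in post #1.
samples = [
    "<p>THIS IS HOW THE first paragraph starts",
    "<p>I CANNOT GET THIS one",
    "<p>A TOUGH ACT TO follow",
    "<p>Or the one after that",
    "<p>I don't want this one",
]


def starts_capitalised(paragraph):
    """Return True if the paragraph opening matches the pattern."""
    return PATTERN.match(paragraph) is not None


results = {text: starts_capitalised(text) for text in samples}
for text, hit in results.items():
    print(hit, text)
```

Only "Or the one after that" is rejected here, so "I don't want this one" would still need to be filtered out by hand or by a stricter rule.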
#4 | |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Code:
([A-Z]* ){2,}
find 1 or more Upper followed by a space, 2 or more times
#5 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Thanks for all the different solutions - you guys make it look so easy!
#6 |
Guru
Posts: 657
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
Shouldn't that be a + instead of the *? The * means 0 or more times, so the pattern would also match paragraphs that have several spaces at the beginning, which are quite often (badly) used for indents.
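The difference is easy to check side by side. A small sketch in Python, assuming the pattern is tested against the start of the paragraph text (the pattern in post #4 is not anchored, so `match` is used here to pin it to the beginning); the two sample strings are invented for the test.

```python
import re

# Pattern as posted in #4, and the variant suggested here with + instead of *.
STAR = re.compile(r"([A-Z]* ){2,}")
PLUS = re.compile(r"([A-Z]+ ){2,}")

# Hypothetical paragraph texts: one indented with spaces, one capitalised start.
indented = "    an ordinary paragraph indented with spaces"
caps = "A TOUGH ACT TO follow"

# With *, each repetition may consist of a space alone, so a run of
# leading spaces satisfies the pattern without any uppercase letter.
star_hits_indent = STAR.match(indented) is not None
plus_hits_indent = PLUS.match(indented) is not None

# Both variants still match a genuinely capitalised opening.
star_caps = STAR.match(caps)
plus_caps = PLUS.match(caps)

print("star on indent:", star_hits_indent)
print("plus on indent:", plus_hits_indent)
print("star on caps:", repr(star_caps.group(0)))
print("plus on caps:", repr(plus_caps.group(0)))
```

The `*` version accepts the space-indented paragraph, while the `+` version rejects it and still picks up "A TOUGH ACT TO ".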
#7 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|