#1 | |
my parent's oops...
Posts: 492
Karma: 1477572
Join Date: Feb 2009
Device: Vx->Handera->Clie-> Axim->505->650->KPW/Aura ->L2->iOS/CBW
|
Regex term for batch identifying word count in comments
I am trying to clean up my comments and want to remove the word counts from some of them. For example, some of my books have comments that end with:
"x,xxx Words" or "xx,xxx Words" or "xxx,xxx Words"
I know I can use (\d+) to select all numbers, but this also selects any other random numbers, dates, or chapter numbers that might be present. Is there a specific regex that would select only the phrase "x,xxx Words", "xx,xxx Words", or "xxx,xxx Words", where x is any digit?
A specific example of what I'm looking for would be seen in this book comment:
Quote:
I don't mind splitting up the regex into several searches depending on the number of words. Any suggestions or links to where I can find an answer would be appreciated.
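To illustrate the problem described in this post, here is a minimal Python sketch. The sample comment text is invented for the example and does not come from the thread; it only shows how a bare `\d+` picks up every run of digits, not just the word count.

```python
import re

# Hypothetical comment text, invented for illustration.
comment = "Published 2009. Chapter 12 is a fan favourite. 45,678 Words"

# A bare \d+ matches every run of digits: the year, the chapter
# number, and both halves of the word count.
all_numbers = re.findall(r"\d+", comment)
print(all_numbers)  # ['2009', '12', '45', '678']
```

The word count also comes back split in two, because the comma is not a digit.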
#2 |
Custom User Title
Posts: 10,952
Karma: 74999999
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
While there's probably a better way to handle it, this will match xxx,xxx words (or Words):
https://regex101.com/r/YhHpr3/1
For other digits, just edit the two {3} bits to reflect the number of digits.
EDIT: Here's an improved version that can handle other numbers of digits: https://regex101.com/r/YhHpr3/2
Last edited by ownedbycats; 01-31-2022 at 03:44 AM.
#3 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Slightly better would be:
Code:
(\d{0,3}[,]?\d{1,3} [Ww]ords)
If I hadn't seen your solution, I think I would have suggested:
Code:
([\d,]+ [Ww]ords)
Code:
([\d,]{2,} [Ww]ords\b)
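For anyone who wants to try these outside the search dialog, here is a rough Python sketch that runs the three patterns above against a sample comment. The comment text, the dictionary keys, and the `find_counts` helper are all made up for illustration; only the patterns themselves come from this post.

```python
import re

# The three patterns quoted above, as Python raw strings.
# The dictionary keys are just labels invented for this sketch.
patterns = {
    "optional_comma": r"(\d{0,3}[,]?\d{1,3} [Ww]ords)",
    "digits_or_commas": r"([\d,]+ [Ww]ords)",
    "two_plus_with_boundary": r"([\d,]{2,} [Ww]ords\b)",
}

# Hypothetical comment text, invented for illustration.
comment = "First published 2009. Chapter 7 runs long. 123,456 Words"


def find_counts(pattern, text):
    """Return every substring of text captured by group 1 of the pattern."""
    return [m.group(1) for m in re.finditer(pattern, text)]


results = {name: find_counts(p, comment) for name, p in patterns.items()}
for name, matches in results.items():
    print(name, matches)
```

On this sample each pattern finds only "123,456 Words" and ignores the year and the chapter number. They differ on edge cases: for instance, the third pattern needs at least two digit-or-comma characters, so it would skip a comment ending in "7 Words".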
#4 |
my parent's oops...
Posts: 492
Karma: 1477572
Join Date: Feb 2009
Device: Vx->Handera->Clie-> Axim->505->650->KPW/Aura ->L2->iOS/CBW
|
Thank you both very much for the suggestions. Regex is wonderfully wacky! I ended up using the original suggestion prior to seeing the improved suggestions. In the end, I used:
Quote:
Quote:
Quote:
Quote:
Quote:
#5 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
A single one to do this is:
Code:
(\d{0,1}[,]?\d{0,3}[,]?\d{1,3} [Ww]ords)
And semantically, "[Ww]" is a bit better than "(W|w)". The former means "one of this list of characters", while the latter means "W or w, and put the match into a group for the replacement". It is a little pedantic and doesn't make a difference in your situation. But if you need to extend this because you found another case, or need to replace the text, the latter might not work as well.
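The grouping difference can be seen in a short Python sketch. The sample comment and the variable names are invented for illustration, and the `(W|w)` variant is simply the pattern above with the character class swapped out; it is not necessarily the exact expression used earlier in the thread.

```python
import re

# Hypothetical comment text, invented for illustration.
comment = "A sprawling saga. 1,234,567 words"

# Character class: the outer parentheses are the only capture group.
with_class = re.compile(r"(\d{0,1}[,]?\d{0,3}[,]?\d{1,3} [Ww]ords)")
# Alternation: (W|w) adds a second capture group inside the first.
with_alternation = re.compile(r"(\d{0,1}[,]?\d{0,3}[,]?\d{1,3} (W|w)ords)")

print(with_class.groups)        # 1
print(with_alternation.groups)  # 2

print(with_class.search(comment).groups())        # ('1,234,567 words',)
print(with_alternation.search(comment).groups())  # ('1,234,567 words', 'w')

# Deleting the word count gives the same text with either pattern.
cleaned = with_class.sub("", comment).rstrip()
print(cleaned)  # A sprawling saga.
```

The extra group is harmless when the match is just deleted, but it shifts the group numbers if the replacement text ever refers back to captured groups.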
#6 |
Custom User Title
Posts: 10,952
Karma: 74999999
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Yes, I'd forgotten that square brackets worked better for that.