02-12-2023, 02:58 AM   #3
Jellby
So, what is a word (for your purposes)? I think it's something like:

Code:
[A-Z0-9][A-Z0-9\.,…’“”!?—-]*
That is, a letter/digit (capitals only, as written) followed by any number (as many as possible, possibly zero) of letters, digits or punctuation marks. You may want to include “ and ‘ in the "initial" class, and maybe & along with the letters.
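
To see what that class picks up, here is a minimal Python sketch (untested against your actual text). The name find_words and the sample sentence are made up for illustration; only the pattern itself comes from above.

```python
import re

# Word pattern from the post: one capital letter or digit, followed by
# any run of capitals, digits or the listed punctuation marks.
WORD = r"[A-Z0-9][A-Z0-9\.,…’“”!?—-]*"

def find_words(text):
    """Return every substring that matches the word pattern (hypothetical helper)."""
    return re.findall(WORD, text)

print(find_words("HELLO, WORLD! said 42 people"))
```

Note that nothing anchors the start of the match, so the capital at the start of a mixed-case word like "Hello" would also be picked up on its own. Whether that matters depends on your text.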

Now you want a number of words separated by spaces. How about (untested):

Code:
({word}\s+)+\b
i.e.
Code:
([A-Z0-9][A-Z0-9\.,…’“”!?—-]*\s+)+\b
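
A rough Python sketch of how the run-of-words pattern could be tried out. It assumes the group is repeated with + (so that several words match in a row) and uses a non-capturing group so the whole run is returned. RUN, find_runs and the sample sentence are hypothetical names for illustration.

```python
import re

# Word pattern from the post.
WORD = r"[A-Z0-9][A-Z0-9\.,…’“”!?—-]*"

# Assumption: the "word plus spaces" group repeats one or more times,
# and the final \b requires the next word to start right after the spaces.
RUN = re.compile(r"(?:" + WORD + r"\s+)+\b")

def find_runs(text):
    """Return each run of consecutive matching words, trailing spaces included."""
    return [m.group(0) for m in RUN.finditer(text)]

print(find_runs("CHAPTER ONE. It was a dark night"))
```

The match includes the whitespace after the last word, so you may want to strip it. A lone one-letter capital word such as "A" followed by a space also counts as a run, which may or may not be what you want.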