Old 11-16-2023, 07:58 PM   #1
jwes
Odd regex problem

I wanted to change chapter numbers to title case, e.g. TWENTY to Twenty or TWENTY-THREE to Twenty-Three. I came up with these patterns:
Find:
Code:
([\p{Lu}])([\p{Lu}]+)((-)([\p{Lu}])([\p{Lu}]*))?
Replace:
Code:
\1\L\2\4\5\L\6
It works for TWENTY, but it turns TWENTY-THREE to Twenty-three. As an experiment, I put '|'s in the replace string like this:
Code:
\1|\L\2|\4|\5|\L\6
and I got
Code:
T|wenty|-|t|hree
So I can see it is being split correctly, but the T in THREE is being lowercased. Any explanation why?
Old 11-16-2023, 08:32 PM   #2
BetterRed
No idea from a technical perspective, but "Twenty-three" is the de facto standard for chapter numbers expressed in words: hyphenated sentence case.

BR
Old 11-16-2023, 09:18 PM   #3
DiapDealer
You need to use an \E to terminate an \L or \U; otherwise they'll just keep going. So I would suggest an \E immediately after the capture group that represents the '-' in the replace expression (or before it).

\1\L\2\E\4\5\L\6

Or:

\1\L\2\4\E\5\L\6

I'm not at a computer where I can test the replace expression exactly.

I also think you don't need the brackets [] around each individual \p{Lu} instance.
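
For anyone who wants to see the "keeps going" behaviour outside Sigil, here is a minimal Python sketch. It is not Sigil's implementation: Python's built-in re module has no \L, \U, or \E in replacement strings, so the hypothetical helper expand() emulates them, and [A-Z] stands in for \p{Lu} because the standard library does not support \p{...}. The names expand, convert, FIND, ORIGINAL, PIPED, and FIXED are made up for illustration.

```python
import re

# [A-Z] is an ASCII-only stand-in for \p{Lu} (assumption: stdlib re lacks \p{...}).
FIND = r"([A-Z])([A-Z]+)((-)([A-Z])([A-Z]*))?"

ORIGINAL = r"\1\L\2\4\5\L\6"      # replace string from the first post
PIPED = r"\1|\L\2|\4|\5|\L\6"     # same, with '|' markers between the pieces
FIXED = r"\1\L\2\E\4\5\L\6"       # with \E ending the first \L


def expand(template, match):
    """Emulate \\N group references plus \\L, \\U and \\E case switches."""
    out = []
    mode = None  # "L", "U" or None; stays in force until \E or another switch
    for token in re.finditer(r"\\(\d+|[LUE])|(.)", template, re.S):
        esc, literal = token.group(1), token.group(2)
        if esc in ("L", "U"):
            mode = esc
            continue
        if esc == "E":
            mode = None
            continue
        # Unmatched optional groups come back as None; treat them as empty.
        text = literal if esc is None else (match.group(int(esc)) or "")
        if mode == "L":
            text = text.lower()
        elif mode == "U":
            text = text.upper()
        out.append(text)
    return "".join(out)


def convert(template, text):
    """Run the find pattern over text using the emulated replace string."""
    return re.sub(FIND, lambda m: expand(template, m), text)


print(convert(ORIGINAL, "TWENTY-THREE"))  # Twenty-three
print(convert(PIPED, "TWENTY-THREE"))     # T|wenty|-|t|hree
print(convert(FIXED, "TWENTY-THREE"))     # Twenty-Three
print(convert(FIXED, "TWENTY"))           # Twenty
```

In the original replace string the first \L is still active when \5 is emitted, which is why the T of THREE comes out lowercase even though the groups are split correctly.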

Last edited by DiapDealer; 11-16-2023 at 09:29 PM.
Old 11-17-2023, 12:41 AM   #4
jwes
Quote:
Originally Posted by DiapDealer
You need to use an \E to terminate an \L or \U; otherwise they'll just keep going. So I would suggest an \E immediately after the capture group that represents the '-' in the replace expression (or before it).

\1\L\2\E\4\5\L\6

Or:

\1\L\2\4\E\5\L\6

I'm not at a computer where I can test the replace expression exactly.

I also think you don't need the brackets [] around each individual \p{Lu} instance.
Thank you, that fixed it. Is there documentation about \L, \U, and \E? I didn't see it in the user guide or when scanning through the links.
Old 11-17-2023, 05:27 AM   #5
DiapDealer
Quote:
Originally Posted by jwes
Thank you, that fixed it. Is there documentation about \L, \U, and \E? I didn't see it in the user guide or when scanning through the links.
In our user guide? I doubt it. We only give the briefest of intros to regex. But I'm sure it's in any thorough documentation for the PCRE flavor of regex.
Old 11-17-2023, 08:27 PM   #6
DNSB
Quote:
Originally Posted by jwes
Thank you, that fixed it. Is there documentation about \L, \U, and \E? I didn't see it in the user guide or when scanning through the links.
You might want to try Perl-compatible Regular Expressions (PCRE) for the documentation, and Regex101 to, as the page says, build, test, and debug regex. Regex101 is especially handy since it lets you play with multiple flavours of regex.
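
If you just want to double-check what the finished find/replace should produce, a short script works too. This is only a sketch in Python, which is not the flavour Sigil uses: it swaps the \L and \E escapes for a replacement function, uses [A-Z] as an ASCII stand-in for \p{Lu}, and the function name title_case_number is made up for illustration.

```python
import re


def title_case_number(text):
    """Title-case runs of capitals, treating each hyphen-separated part on its own."""
    # Two or more capitals, optionally followed by hyphenated all-caps parts.
    pattern = r"\b[A-Z]{2,}(?:-[A-Z]+)*\b"

    def fix(match):
        # str.capitalize() uppercases the first letter and lowercases the rest.
        return "-".join(part.capitalize() for part in match.group(0).split("-"))

    return re.sub(pattern, fix, text)


print(title_case_number("TWENTY"))        # Twenty
print(title_case_number("TWENTY-THREE"))  # Twenty-Three
```

Note that this touches every all-caps word in the text it is given, so it is best run only on the chapter-number strings themselves.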