Old 03-23-2021, 09:59 AM   #4
DiapDealer
All the documentation I can find suggests that PCRE should easily support both \1 through \9 and \10 through \99 backreferences, but Sigil's bundled PCRE clearly does not. It does, however, allow the \g{n} backreference syntax, which can go past the 9-backreference limit.

String: <p>0123456789abc</p>
Find: (\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)([a-z])([a-z])([a-z])
Replace: \1\2\3\4\5\6\7\8\9\g{10}\g{11}\g{12}\g{13}
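
To make the replacement above concrete, here is a rough Python sketch of how a template mixing \1-\9 and \g{n} could be expanded. It is not Sigil's code: Python's re module stands in for PCRE purely to do the matching, and expand_template is a hypothetical helper written for this illustration.

```python
import re

# Hypothetical helper for illustration only; this is NOT how Sigil does it.
# It recognises two reference forms in a replacement template:
#   \g{n}  brace-delimited, any number of digits
#   \1-\9  a single digit after the backslash
_REF = re.compile(r"\\g\{(\d+)\}|\\([1-9])")

def expand_template(template, match):
    """Replace each reference in the template with the matching captured group."""
    def substitute(ref):
        number = int(ref.group(1) or ref.group(2))
        return match.group(number)
    return _REF.sub(substitute, template)

text = "<p>0123456789abc</p>"
find = r"(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)([a-z])([a-z])([a-z])"
replace = r"\1\2\3\4\5\6\7\8\9\g{10}\g{11}\g{12}\g{13}"

m = re.search(find, text)
result = text[:m.start()] + expand_template(replace, m) + text[m.end():]
reversed_letters = expand_template(r"\g{13}\g{12}\g{11}", m)

print(result)            # <p>0123456789abc</p>
print(reversed_letters)  # cba
```

Because the braces mark exactly where the group number ends, \g{10} can only mean group 10, and the Replace string rebuilds the original text unchanged.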

The bottom line seems to be that anything other than a single digit (1-9) after the backslash is ambiguous: it could be a backreference, or it could be a character code written in octal. For completely unambiguous double-digit backreferences, always use the \g{nn} syntax.
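
As a small illustration of that ambiguity, this Python sketch spells out three ways the two characters after the backslash in \10 could be read. The capture values are made up, and nothing here claims to show which reading Sigil's PCRE actually picks.

```python
# Made-up captures for illustration; index 0 is unused so that
# groups[n] lines up with backreference n.
groups = ["", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]

# Reading 1: \10 is a backreference to group 10.
as_group_10 = groups[10]

# Reading 2: \1 is a backreference to group 1, followed by a literal "0".
as_group_1_then_zero = groups[1] + "0"

# Reading 3: "10" is an octal character code (octal 10 is decimal 8, backspace).
as_octal_char = chr(int("10", 8))

for label, value in [
    ("group 10", as_group_10),
    ("group 1 + '0'", as_group_1_then_zero),
    ("octal 10", as_octal_char),
]:
    print(f"{label}: {value!r}")
```

All three readings give different output, which is why the brace-delimited \g{10} form is the safe choice.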

From Sigil's src/PCRE/SPCRE.cpp:

Code:
// The maximum number of catpures that we will allow.
const int PCRE_MAX_CAPTURE_GROUPS = 30;
So the number of backreferences will also be capped at 30 (provided 30 groups were, in fact, captured), whether they are accessed by name or by number via the \g{} syntax.
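
If you want a quick sanity check on how many groups a Find expression uses, something like the Python sketch below would do. The assumptions: Python's re module counts groups by its own rules, which may differ from PCRE for exotic syntax, check_group_count is a hypothetical helper, and the 30 simply mirrors the constant quoted above. It says nothing about what Sigil does when the cap is exceeded.

```python
import re

# Mirrors the PCRE_MAX_CAPTURE_GROUPS value quoted from Sigil's source.
MAX_CAPTURE_GROUPS = 30

def check_group_count(pattern):
    """Return (number of capture groups, whether that fits under the cap).

    Hypothetical helper: groups are counted with Python's re module,
    not with PCRE itself.
    """
    count = re.compile(pattern).groups
    return count, count <= MAX_CAPTURE_GROUPS

find = r"(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)(\d)([a-z])([a-z])([a-z])"
print(check_group_count(find))         # (13, True)
print(check_group_count(r"(x)" * 31))  # (31, False)
```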

Last edited by DiapDealer; 03-23-2021 at 01:26 PM.