03-01-2017, 05:03 PM   #59
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
Double Quote in RegEx code for auto-generating ToC?

As noted earlier, this Edit ToC feature is very useful. Thanks to Kovid.

Does anyone know what RegEx coding should be used within this Edit ToC function in order to include ToC items that are enclosed in double quotes?

FWIW, to provide some context, I have found the following code to be quite effective in generating ToC entries (copied into the first slot in the pop-up box "Create ToC from XPath"). It captures variations using roman numerals, chapter numbers, short descriptors, etc., and can be retained with the "save settings" option, using the code itself as the name of the setting:

Note: the fourth and fifth characters in the code below are a colon followed by a small letter p, but those two characters are auto-replaced by a pesky icon sticking out its tongue, so just reverse-substitute if you use the code:

//h:p[re:test(., "(^\s*[0-9]{1,2}\s*$)|(^.{1,80}[a-z]\s*$)|(^\s*[IVX]{1,6}\s*$)|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)|(^\s*map)|(^\s*index)|(^\s*introduction)|(^\s*notes)", "i")]
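To see roughly which paragraph texts this expression would pick up, the regex part can be tried on its own. This is only a sketch: it uses Python's `re` module as a stand-in for the `re:test` function, on the assumption that both treat the pattern the same way, and it writes `chapter` and `\s*notes` without internal spaces. The helper name `looks_like_toc_entry` and the sample strings are made up for illustration.

```python
import re

# Regex portion of the XPath expression, split over several lines for readability.
PATTERN = (
    r"(^\s*[0-9]{1,2}\s*$)"        # a bare one- or two-digit chapter number
    r"|(^.{1,80}[a-z]\s*$)"        # a short line that ends in a letter
    r"|(^\s*[IVX]{1,6}\s*$)"       # a bare roman numeral
    r"|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)|(^\s*map)"
    r"|(^\s*index)|(^\s*introduction)|(^\s*notes)"
)

# The "i" argument in the XPath call corresponds to case-insensitive matching.
TOC_RE = re.compile(PATTERN, re.IGNORECASE)


def looks_like_toc_entry(text):
    """Return True if the paragraph text matches any alternative (hypothetical helper)."""
    return TOC_RE.search(text) is not None


if __name__ == "__main__":
    samples = [
        "12",
        "XIV",
        "Chapter 3: The Storm",
        "The Long Road Home",
        "Is this the end?",
        "123",
    ]
    for sample in samples:
        print(repr(sample), looks_like_toc_entry(sample))
```

Because of the case-insensitive flag, `[a-z]` also matches capital letters, so any paragraph of up to about 80 characters that ends in a letter counts as a candidate. Lines ending in punctuation, such as "Is this the end?", are not matched by this version.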

I recently set up a modified version, as follows, to also catch ToC items that end with a question mark, exclamation mark, etc. I use it only as an alternative approach when necessary, because in some cases it generates too many extraneous entries that need to be manually deleted:

Code:
//h:p[re:test(., "(^\s*[0-9]{1,2}\s*[\.\!\?]{0,1}\s*$)|(^.{1,80}[a-z]\s*$)|(^.{1,20}[0-9]\s*$)|(^.{1,20}\s*[\.\!\?]\s*$)|(^\s*[IVX]{1,6}\s*[\.]{0,1}\s*$)|(^\s*prologue)|(^\s*epilogue)|(^\s*chapter)|(^\s*map)|(^\s*index)|(^\s*introduction)|(^\s*notes)", "i")]
I also tried to include code that would catch ToC entries that might end in a double quote ("), but because double quotes seem to have a special role in the Edit ToC process, the coding was rejected as invalid whether I used " by itself, with an escape indicator \", or self-enclosed in double quotes """. Does anyone know what would work?
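One possible angle, offered as an untested sketch rather than a confirmed fix: XPath 1.0 string literals have no escape character, but a literal may be delimited with single quotes instead of double quotes, and a single-quoted literal can then contain a double quote. Whether the Edit ToC box accepts this form has not been checked here. The alternative `(^.{1,80}"\s*$)` and the helper `build_xpath` below are hypothetical, and Python's `re` module again stands in for `re:test`.

```python
import re

# Hypothetical extra alternative: a short line that ends in a straight double quote.
QUOTE_ALT = r'(^.{1,80}"\s*$)'


def build_xpath(pattern):
    """Wrap a regex in the //h:p[re:test(...)] form using single-quoted literals.

    XPath 1.0 literals cannot escape their own delimiter, so a pattern that
    contains " must be wrapped in ' and must not itself contain '.
    """
    if "'" in pattern:
        raise ValueError("pattern contains a single quote; cannot wrap it in '...'")
    return "//h:p[re:test(., '" + pattern + "', 'i')]"


xpath = build_xpath(QUOTE_ALT)

if __name__ == "__main__":
    print(xpath)
    # Check the regex itself against two made-up paragraph texts.
    for sample in ['The "Grand Tour"', "He left."]:
        print(repr(sample), re.search(QUOTE_ALT, sample, re.IGNORECASE) is not None)
```

If this form works, the new alternative could be joined to the longer pattern with `|` like the others, with the whole pattern and the `i` flag both wrapped in single quotes.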

Thanks for any help on this.

Last edited by BetterRed; 03-01-2017 at 05:32 PM. Reason: add some noparse tags