11-05-2022, 01:30 PM   #311
killo3967

Quote:
Originally Posted by DaltonST
To test your regular expressions before you try to update anything, I recommend: https://pythex.org/

See the 2 images at the bottom of: https://www.mobileread.com/forums/sh...8&postcount=24

Read the ToolTips in the image of the MCS Tab.

In MCS in the TXT Query Tab, there are actually 3 different regexes possible (only #1 is required):

[#1] Your (e|E)(PUB|pub|Pub)\s(v|r)(\d\.\d) which is the TXT Query itself.

[#2] The "Filter Using Custom Column" regex, which you do NOT need here.

[#3] The "Update This Custom Column" regex, which looks at the text returned by your #1 above: "(e|E)(PUB|pub|Pub)\s(v|r)(\d\.\d)". So, #3 must take the text like "ePub r1.3" and extract only the "1.3".

Your #revision Custom Column must be textual, not numeric or any other Type. Refer to the 2nd image in https://www.mobileread.com/forums/sh...8&postcount=24

The answer to your question of "What regex #3 should I use?" is shown with its test case in an image below: [0-9.]+
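As a rough sketch of how #1 and #3 fit together: the snippet below runs the TXT Query regex first, then applies [0-9.]+ to the whole matched text. The function name extract_revision and the sample text are made up for illustration, and the way MCS chains the two regexes internally is an assumption based on the description above, not taken from the plugin's source.

```python
import re

# Flags as described for the MCS TXT Query; assumed here for both steps.
FLAGS = re.IGNORECASE | re.DOTALL | re.MULTILINE

def extract_revision(book_text):
    """Hypothetical helper: return a revision such as '1.3', or None."""
    # Regex #1: the TXT Query, finds text like "ePub r1.3".
    query = re.compile(r"(e|E)(PUB|pub|Pub)\s(v|r)(\d\.\d)", FLAGS)
    found = query.search(book_text)
    if found is None:
        return None
    # Regex #3: applied to the whole text matched by #1, not to a group.
    number = re.search(r"[0-9.]+", found.group(0))
    return number.group(0) if number else None

sample = "Title page\nePub r1.3\nChapter 1"
print(extract_revision(sample))  # prints 1.3
```

The value that comes back is a string, which is consistent with the #revision column having to be textual.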

Personally, I would not have used any capture groups, since the MCS function looks at all of the returned text and not at the result of any single group. You cannot specify in MCS which group to use; the whole match is always used, so your regex must not depend on a specific group. See the 3rd image below for a simpler regex to use instead of your grouped #1: "EPUB [rv][0-9.]+" using IGNORECASE.
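To illustrate the point about groups in plain Python (outside MCS): both patterns return the same whole match, and the groups only add results that cannot be selected anyway. The sample text is invented for the example.

```python
import re

text = "Converted file: ePub r1.3 (2022)"

# The original grouped regex and the simpler group-free one.
grouped = re.search(r"(e|E)(PUB|pub|Pub)\s(v|r)(\d\.\d)", text)
simple = re.search(r"EPUB [rv][0-9.]+", text, re.IGNORECASE)

print(grouped.groups())  # ('e', 'Pub', 'r', '1.3'): per-group results
print(grouped.group(0))  # ePub r1.3: the whole match
print(simple.group(0))   # ePub r1.3: same whole match, no groups needed
```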

The MCS TXT Query regular expression function always compiles with IGNORECASE. Makes things much simpler. The Python regex compile is:
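import re
re.escape("\\")  # as quoted; the return value is discarded here, so this line has no effect
p = re.compile(re_string, re.IGNORECASE | re.DOTALL | re.MULTILINE)
match = p.search(s_string)  # first match anywhere in s_string, or None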



Note that since it uses MULTILINE and DOTALL, in some cases it would be necessary to specify where the selected text ends, such as with a trailing \s*. Search the web for: "regular expressions how to stop selection of characters at new-line"
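A small sketch of why this matters: with DOTALL, "." also matches the newline, so a greedy ".+" keeps going past the end of the line. The example stops the match with [^\n]+, which is one common alternative and not the trailing \s* mentioned above. The sample text is invented.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL | re.MULTILINE
page = "ePub r1.3\nAll rights reserved."

# With DOTALL, ".+" also consumes the newline and everything after it.
greedy = re.search(r"EPUB [rv].+", page, FLAGS)
# "[^\n]+" cannot cross a newline, so the match ends with the line.
bounded = re.search(r"EPUB [rv][^\n]+", page, FLAGS)

print(repr(greedy.group(0)))   # 'ePub r1.3\nAll rights reserved.'
print(repr(bounded.group(0)))  # 'ePub r1.3'
```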

Thank you very much.
Your answer was very enlightening, and I can already see where my problem was with the regular expressions. I come from .NET, and in Python they are slightly different.