Quote:
Originally Posted by Ted Friesen
Edit:
But adding \d+ after the trailing space resulted in just the volume number. Someone needs to explain why \s(.*?) \d+ would capture Jul-Aug 2000 and (.*?) would not
"needs to explain" is a bit strong...
Your expression needs to match the entire string. Anything not matched is left behind and included in the result. To that end you should understand the difference between greedy and non-greedy operators, and the semantics of anchors. The expression "(.*)" is greedy: it matches as much as possible while still letting the overall match succeed. The expression "(.*?)" is non-greedy: it matches as little as possible while still letting the overall match succeed, and at the end of an unanchored expression that means it matches nothing at all. Either one can match nothing, and either one leaves behind any text it did not match. Adding the "\d+" forces the match to find at least one digit, so the "(.*?)" in front of it has to expand far enough to reach one.
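To see the difference in practice, here is a minimal Python sketch. The title string is made up for illustration (your real input isn't shown in this post), and I'm assuming Python's "re" module behaves the same way as the tool you are using.

```python
import re

# Hypothetical input; the real title from the thread is not shown here.
title = "Some Magazine #12 Jul-Aug 2000"

# Non-greedy group at the end of an unanchored pattern: it can succeed
# by matching nothing, so it does.
lazy = re.search(r"#\d+\s(.*?)", title)

# Greedy group: takes everything up to the end of the line.
greedy = re.search(r"#\d+\s(.*)", title)

# Non-greedy group followed by " \d+": it must grow until a space and
# at least one digit follow it.
forced = re.search(r"#\d+\s(.*?) \d+", title)

print(repr(lazy.group(1)))    # ''
print(repr(greedy.group(1)))  # 'Jul-Aug 2000'
print(repr(forced.group(1)))  # 'Jul-Aug'

# In a replacement, only the matched part ("#12 ") is replaced; the
# unmatched text on both sides stays in the result.
replaced = re.sub(r"#\d+\s(.*?)", r"\1", title)
print(replaced)               # Some Magazine Jul-Aug 2000
```

Note how the last result still contains the date even though the group captured nothing: the date was never matched, so it was simply left behind.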
If you add anchors then you can be sure that you match the entire line. Without spending a lot of time looking at the specifics, it seems that the anchored expression
Code:
^(.*?)(\#)(\d+)\s(.*?)$
does what you want, as does
Code:
^(.*?)(\#)(\d+)\s(.*)$
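As a quick check of why the two behave the same, here is a small Python sketch. Again the title is a made-up example, and I'm assuming your tool's regex engine agrees with Python's "re" module. With "$" at the end, the last group has to reach the end of the line whether it is greedy or not.

```python
import re

# Hypothetical input; substitute a real title to check your own case.
title = "Some Magazine #12 Jul-Aug 2000"

lazy_tail = re.compile(r"^(.*?)(\#)(\d+)\s(.*?)$")
greedy_tail = re.compile(r"^(.*?)(\#)(\d+)\s(.*)$")

# The "$" anchor forces the final group to extend to the end of the
# line, so the non-greedy and greedy versions capture the same text.
lazy_groups = lazy_tail.match(title).groups()
greedy_groups = greedy_tail.match(title).groups()

print(lazy_groups)    # ('Some Magazine ', '#', '12', 'Jul-Aug 2000')
print(greedy_groups)  # ('Some Magazine ', '#', '12', 'Jul-Aug 2000')
```

Because every part of the line now belongs to some group, nothing is left behind unmatched, and you can rebuild the result from whichever groups you want.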