Quote:
Originally Posted by Starson17
I know that as long as the filename has author immediately followed by title I should try the second, and if it has author followed by series, I should try the first one.
Code:
(?P<author>[^-]+)(( - | *-- *)[[(]?(?P<series>[^-]+)[[( ]+(?P<series_index>[0-9.]+)?[])]?)?( - | *-- *)(?P<title>.+)
This one makes the series optional, so it works with a series and *also* without one.
It also allows optional parentheses or square brackets around the series and/or the series number.
So it matches:
Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red.doc
Sir Arthur Conan Doyle - Study in red.doc
Sir Arthur Conan Doyle -- Sherlock Holmes 1 - Study in red.doc
Sir Arthur Conan Doyle -- Study in red.doc
Sir Arthur Conan Doyle--Sherlock Holmes 1.0--Study in red.doc
Sir Arthur Conan Doyle--Study in red.doc
Sir Arthur Conan Doyle - (Sherlock Holmes 1) - Study in red.doc
Sir Arthur Conan Doyle - [Sherlock Holmes 1] - Study in red.doc
Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red.doc
Sir Arthur Conan Doyle - Sherlock Holmes [1] - Study in red.doc
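If you want to check the expression outside calibre, here is a rough Python sketch. It is only my test harness, not how calibre itself applies the expression: the `parse` helper is a made-up name, removing the extension first is an assumption, and I escaped the literal `[` and `]` inside the character classes (same meaning) so that newer Python versions don't warn about a "possible nested set".

```python
import os
import re

# Expression from the post above. Only change: the literal '[' and ']' inside
# character classes are backslash-escaped, which matches the same characters.
PATTERN = re.compile(
    r"(?P<author>[^-]+)"
    r"(( - | *-- *)[\[(]?(?P<series>[^-]+)[\[( ]+"
    r"(?P<series_index>[0-9.]+)?[\])]?)?"
    r"( - | *-- *)(?P<title>.+)"
)


def parse(filename):
    """Hypothetical helper: return the named groups as a dict, or None."""
    # Assumption: the name really ends in an extension such as ".doc".
    stem, _ext = os.path.splitext(filename)
    m = PATTERN.match(stem)
    if m is None:
        return None
    # The greedy groups can keep a stray space next to a separator
    # (e.g. the author before "--"), so strip every captured value.
    return {key: (value.strip() if value else None)
            for key, value in m.groupdict().items()}


if __name__ == "__main__":
    for name in (
        "Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red.doc",
        "Sir Arthur Conan Doyle--Sherlock Holmes 1.0--Study in red.doc",
        "Sir Arthur Conan Doyle - Sherlock Holmes [1] - Study in red.doc",
        "Sir Arthur Conan Doyle -- Study in red.doc",
    ):
        print(parse(name))
```

Without the `strip()` call, some captures keep a trailing space, for example the author in the `--` variants and the series in the `(1)` and `[1]` variants.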
If you find a case that my expression doesn't cover, do not hesitate to post it; we can try to craft another, even more complex RE.
Here is another take on the problem:
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
This one doesn't cover parentheses around the series number, as in:
Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red.doc
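To see what "doesn't cover" means in practice, here is a small Python sketch that runs the second expression on a plain name and on the parenthesized one. The `fields` helper is a made-up name, and I am assuming the extension has already been removed from the filename. In my reading, the parenthesized case still matches, but the series group stays empty and the series name ends up in the title.

```python
import re

# Second expression from the post, unchanged, split over lines for readability.
PATTERN2 = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)


def fields(name):
    """Hypothetical helper: named groups as a dict, or None if no match."""
    m = PATTERN2.match(name)
    return None if m is None else m.groupdict()


# Plain series number: series, index and title are all picked up.
good = fields("Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red")
print(good["series"], good["series_index"], good["title"])

# Parenthesized series number: the optional series part is skipped,
# the title stops at the first "(" and the real title is lost.
bad = fields("Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red")
print(bad["series"], repr(bad["title"]))
```

So for names like that one, the first expression is the safer choice.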