08-17-2011, 03:32 PM   #7
kacir
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Starson17
I know that as long as the filename has author immediately followed by title I should try the second, and if it has author followed by series, I should try the first one.
Code:
(?P<author>[^-]+)(( - | *-- *)[[(]?(?P<series>[^-]+)[[( ]+(?P<series_index>[0-9.]+)?[])]?)?( - | *-- *)(?P<title>.+)
In this one the series is optional, so it works with a series and *also* without one.
Parentheses or square brackets around the series and/or the series number are optional too.
So it matches:
Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red.doc
Sir Arthur Conan Doyle - Study in red.doc
Sir Arthur Conan Doyle -- Sherlock Holmes 1 - Study in red.doc
Sir Arthur Conan Doyle -- Study in red.doc
Sir Arthur Conan Doyle--Sherlock Holmes 1.0--Study in red.doc
Sir Arthur Conan Doyle--Study in red.doc
Sir Arthur Conan Doyle - (Sherlock Holmes 1) - Study in red.doc
Sir Arthur Conan Doyle - [Sherlock Holmes 1] - Study in red.doc
Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red.doc
Sir Arthur Conan Doyle - Sherlock Holmes [1] - Study in red.doc
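
If you want to check the expression outside calibre, here is a minimal Python sketch that runs it over a few of the file names above. The helper name parse_filename is made up for this example, and dropping the extension and stripping stray spaces from the groups are my assumptions, not something the expression does by itself. The only change to the expression is that "[" inside the character classes is written as "\[", which should be equivalent and only avoids Python's "possible nested set" warning.

```python
import os
import re

# First expression from this post. The only change: "[" inside the character
# classes is written "\[" so Python does not warn about a possible nested set.
SERIES_OPTIONAL = re.compile(
    r"(?P<author>[^-]+)"
    r"(( - | *-- *)[\[(]?(?P<series>[^-]+)[\[( ]+(?P<series_index>[0-9.]+)?[])]?)?"
    r"( - | *-- *)(?P<title>.+)"
)


def parse_filename(filename):
    """Return (author, series, series_index, title), or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    # Assumption: the extension is dropped before matching.
    stem, _ext = os.path.splitext(filename)
    m = SERIES_OPTIONAL.match(stem)
    if m is None:
        return None

    # The groups can carry a stray leading or trailing space, so strip them.
    def clean(group_name):
        value = m.group(group_name)
        return value.strip() if value is not None else None

    return tuple(clean(n) for n in ("author", "series", "series_index", "title"))


if __name__ == "__main__":
    samples = [
        "Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red.doc",
        "Sir Arthur Conan Doyle -- Study in red.doc",
        "Sir Arthur Conan Doyle--Sherlock Holmes 1.0--Study in red.doc",
        "Sir Arthur Conan Doyle - Sherlock Holmes [1] - Study in red.doc",
    ]
    for name in samples:
        print(name, "->", parse_filename(name))
```

Without the strip, the author or series group can come back with a space on the end (for example with the " -- " separator), so it is worth keeping if you reuse this.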

If you find a case that my expression doesn't cover, do not hesitate to post it. We can try to craft another, even more complex RE.

Here is another take on the problem:
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
This one doesn't cover parentheses around the series number, like this:
Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red.doc
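
To see that limitation for yourself, here is a small Python sketch along the same lines. Again, the helper name parse_second is made up, and dropping the extension and stripping spaces are my assumptions. Note that the expression still matches the problem file name; it just puts the wrong text into the title and finds no series.

```python
import os
import re

# Second expression from this post, unchanged.
SECOND_TAKE = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)


def parse_second(filename):
    """Return (author, series, series_index, title), or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    # Assumption: the extension is dropped before matching.
    stem, _ext = os.path.splitext(filename)
    m = SECOND_TAKE.match(stem)
    if m is None:
        return None

    # Strip stray spaces that the title group can pick up.
    def clean(group_name):
        value = m.group(group_name)
        return value.strip() if value is not None else None

    return tuple(clean(n) for n in ("author", "series", "series_index", "title"))


if __name__ == "__main__":
    for name in [
        "Sir Arthur Conan Doyle - Sherlock Holmes 1 - Study in red.doc",
        "Sir Arthur Conan Doyle - Sherlock Holmes (1) - Study in red.doc",
    ]:
        print(name, "->", parse_second(name))
```

For the second file name the series group stays empty and the title comes out as "Sherlock Holmes", because the title group stops at the first "(".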