View Single Post
Old 12-11-2014, 02:15 PM   #5
kacir
Wizard
kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.kacir ought to be getting tired of karma fortunes by now.
 
kacir's Avatar
 
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
This works:
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
Slightly different implementation
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:[[(]\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*[])])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
I had to put those to "code" tags, because the forum server converted some substrings to smileys ;-)

Install plugin called "Quick preferences" - it contains the first of those REs, I think.
The second one was created by me, I think ;-). Search for my posts and phrase "Regular expression".
The RE covers lots of scenarios, including missing series info, series or series number in brackets - square and regular, varying number/kind of whitespace.

Calibre is programmed in Python, and uses Python version of Regular expressions.
kacir is offline   Reply With Quote