07-17-2011, 09:32 PM   #7
charlweed
I settled on a solution.


As Calibre is great software, I will try to respond with some good, usable suggestions. In the meantime, here is what I eventually settled on.
Code:
 ((?P<series>\w+)?\W(?P<series_index>\d+).+?)?(?P<title>.*)\s+\((?P<author>.*)\)\s?(?P<published>\d+)?.*
This is very much a Perl-style expression (the (?P<name>...) named groups are the Python flavor of it), and I could not find a web tool that can parse it. Nonetheless, the key for me was that if Calibre cannot match the entire expression, it dumps everything into <title>. If <title> is not in the expression, it seems to do nothing.
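Since no web tool would parse it, here is a rough way to try the expression outside Calibre with Python's own re module. This is only a sketch: the sample file names are made up, the parse_filename helper is hypothetical, and I am assuming Calibre matches from the start of the name the way re.match does.

```python
import re

# The expression from the post, unchanged (split across two lines for width).
PATTERN = re.compile(
    r"((?P<series>\w+)?\W(?P<series_index>\d+).+?)?"
    r"(?P<title>.*)\s+\((?P<author>.*)\)\s?(?P<published>\d+)?.*"
)


def parse_filename(name):
    """Return the named groups as a dict, or None if the expression does not match."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


# Hypothetical sample names, not taken from a real library.
samples = [
    "Discworld 05 Sourcery (Terry Pratchett) 1988",  # series, index, title, author, year
    "Sourcery (Terry Pratchett)",                    # title and author only
    "Sourcery",                                      # no "(author)" part: no match at all
]

for sample in samples:
    print(repr(sample), "->", parse_filename(sample))
```

The last sample returns None because the expression requires a parenthesized author. That is the case where Calibre, as far as I can tell, falls back to putting the whole name into <title>.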
My first suggestion is that the test functionality do as-you-type validation and matching of the expression, so that the user knows when Calibre is not going to find any data given the expression and sample.
For the tutorial, it should explicitly state that the Calibre regular expressions are an extension of other regular expression ... um, grammars. It should also detail how symbolic (named) grouping works, and how general parenthetical grouping works.
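To show what I mean by the two kinds of grouping, here is a small sketch in plain Python (not Calibre itself; the sample string is made up). Plain parentheses create a numbered group, while (?P<name>...) creates a group that has both a number and a name.

```python
import re

# One plain (numbered) group followed by one symbolic (named) group.
m = re.match(r"(\w+)\s+(?P<series_index>\d+)", "Discworld 05")

plain = m.group(1)                 # plain parentheses: reachable by position only
named = m.group("series_index")    # named group: reachable by name...
by_number = m.group(2)             # ...and still by position
named_only = m.groupdict()         # contains only the named groups

print(plain, named, by_number, named_only)
```

Presumably the names are what let Calibre know which piece goes into which metadata field, which is why the unnamed outer parentheses in my expression are only there for grouping.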
A table of recipes for pulling data out of some sample strings would be great. Maybe I can help with the first of those.