Regular Expression Help Needed
I'm importing books and need to write a regular expression that handles the three file-name formats they're in. The formats are as follows:
1. author - title.type
2. author - title - series.type
3. author - title - series - series_index.type
Each category is separated by a hyphen with a space on either side. The regex needs to be able to handle the occasional hyphenated word in the title, but those hyphens are not preceded or followed by a space.
So far, I have the following regex:
(?P<author>[^-]+) - (?P<title>[^-]+) - (?P<series>[^-]+) - (?P<series_index>[^.]+)?
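For anyone who wants to reproduce this outside Calibre, here is a minimal sketch that runs the pattern above through Python's `re` module. The sample file names and the helper `try_current` are made up for illustration, and plain `re.match` is only an assumption about how the pattern gets applied; Calibre itself may behave differently.

```python
import re

# The pattern from above, unchanged (split across two lines for readability).
CURRENT = re.compile(
    r"(?P<author>[^-]+) - (?P<title>[^-]+) - (?P<series>[^-]+)"
    r" - (?P<series_index>[^.]+)?"
)

# Hypothetical sample names: one per format, plus a hyphenated title.
SAMPLES = [
    "Jane Doe - Some Title.epub",
    "Jane Doe - Some Title - My Series.epub",
    "Jane Doe - Some Title - My Series - 3.epub",
    "Jane Doe - Well-Known Title - My Series - 3.epub",
]


def try_current(file_name):
    """Return the named groups if the pattern matches, else None."""
    match = CURRENT.match(file_name)
    return match.groupdict() if match else None


for name in SAMPLES:
    print(name, "->", try_current(name))
```

In this sketch only the third sample matches. The pattern requires all three " - " separators, so the shorter formats fail outright, and `[^-]+` cannot cross the hyphen in "Well-Known".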
This works fine for case 3, though it seems to put a .0 at the end of the series number: book #3 in a series is given a series_index of 3.0 when I test the file name. I don't know whether this is a problem with my regex or simply how Calibre displays that number.
For case 1, the title is listed as "author - title" and the rest of the fields are unknown. For case 2, the title is listed as "author - title - series" and the rest of the fields are likewise unknown.
Is anyone able to give me a hand with this?
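One direction that might cover all three formats is to nest the series and series_index parts in optional groups and to split on the spaced hyphen rather than excluding every hyphen. The following is only a sketch in plain Python, not something verified inside Calibre. It assumes the extension is part of the string being matched and that series_index is a plain number; the names `BOOK_NAME`, `parse_name`, and the `type` group are made up for illustration.

```python
import re

# Assumptions: the full file name (with extension) is matched, fields are
# separated by " - ", and series_index is an integer or decimal number.
BOOK_NAME = re.compile(
    r"^(?P<author>.+?) - (?P<title>.+?)"
    # Optional series, with an optional numeric index nested inside it.
    r"(?: - (?P<series>.+?)(?: - (?P<series_index>\d+(?:\.\d+)?))?)?"
    r"\.(?P<type>[^.]+)$"
)


def parse_name(file_name):
    """Return the named groups as a dict, or None if nothing matches."""
    match = BOOK_NAME.match(file_name)
    return match.groupdict() if match else None


print(parse_name("Jane Doe - Some Title.epub"))
print(parse_name("Jane Doe - Some Title - My Series.epub"))
print(parse_name("Jane Doe - Well-Known Title - My Series - 3.epub"))
```

Because `.+?` only stops at a hyphen that has a space on both sides, "Well-Known" stays inside the title. In this sketch the series_index group captures the text `3`, so a value of 3.0 would have to come from whatever converts that string afterward, not from the pattern.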
Thanks,
Dennis