Quote:
Originally Posted by MSWallack
I thought that I was making good progress, but now I'm stumped. Some of my files don't have a series. For example:
King, Stephen - Under the Dome.epub
Here is the RE:
^(?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b)(\s*-\s*)?(?P<title>([^\-_\[\(]+)) ((\[(?P<series>[^0-9\-]+) (- )?\#?(?P<series_index>[0-9.]+)\]))?
When I test that RE on that filename in Calibre, I get the following results:
Title: Under the
Author: _Stephen King [I've used an underscore to indicate a space that Calibre is putting before the author's name]
I can't quite figure out where the RE is breaking.
Thanks again for the help.
|
You have a literal space between the title group and the optional series group. That space is mandatory, so the title has to be followed by a space even when there is no series. In "Under the Dome" the last word is not followed by a space, so the regex backtracks and ends the title at "Under the", leaving "Dome" unmatched. "Dome" can't be the series either, since the series has to be in brackets and is required to have a series_index too.
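If you want to see the backtracking for yourself, it can be reproduced outside Calibre with Python's `re` module, which accepts the same `(?P<name>...)` syntax. This is only a sketch: it assumes the `.epub` extension has already been removed from the name, and it calls `re.search` directly, which may not be exactly how Calibre applies the pattern.

```python
import re

# The pattern from the question, split over three lines for readability.
# Note the literal space at the end of the title line.
ORIGINAL = re.compile(
    r'^(?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b)(\s*-\s*)?'
    r'(?P<title>([^\-_\[\(]+)) '
    r'((\[(?P<series>[^0-9\-]+) (- )?\#?(?P<series_index>[0-9.]+)\]))?'
)

# Assumption: the extension is stripped before the regex is applied.
name = "King, Stephen - Under the Dome"

m = ORIGINAL.search(name)
title = m.group("title")      # what the title group ended up capturing
series = m.group("series")    # None when the optional series group did not match
leftover = name[m.end():]     # whatever the regex could not consume

print(repr(title), repr(series), repr(leftover))
```

The title stops one word early because the mandatory space is the last thing the regex can match; everything after that space is left over.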
Try this:
Code:
^(?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b)(\s*-\s*)?(?P<title>([^\-_\[\(]+))((\[(?P<series>[^0-9\-]+) (- )?\#?(?P<series_index>[0-9.]+)\]))?
Edit: IIRC, the space preceding the author is not a problem. Calibre aggressively strips leading and trailing spaces where they might cause trouble.
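A quick way to check the corrected pattern is to run it through Python's `re` module on one name without a series and one with. Treat this as a rough sketch rather than what Calibre does internally: the second filename and the `parse` helper are made up for illustration, the extension is assumed to be stripped already, and the `.strip()` call only imitates the whitespace stripping mentioned above.

```python
import re

# Corrected pattern: no literal space between the title and series groups.
FIXED = re.compile(
    r'^(?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b)(\s*-\s*)?'
    r'(?P<title>([^\-_\[\(]+))'
    r'((\[(?P<series>[^0-9\-]+) (- )?\#?(?P<series_index>[0-9.]+)\]))?'
)

def parse(name):
    """Hypothetical helper: return the named groups with outer whitespace removed."""
    m = FIXED.search(name)
    if m is None:
        return None
    # Groups that did not take part in the match stay None.
    return {key: (value.strip() if value else value)
            for key, value in m.groupdict().items()}

# Filename from the question (no series).
no_series = parse("King, Stephen - Under the Dome")

# Made-up filename with a series, in the same naming scheme.
with_series = parse("Doe, Jane - Some Title [Some Series 2]")

print(no_series)
print(with_series)
```

With the space gone, the title group captures a trailing space whenever a series follows, which is why the sketch strips the values before using them.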