Quote:
Originally Posted by mightymouse2045
How about
# title - author
# series ##_ title - author
series ##_ title - author
But strip the number # at the beginning?
OK.
Let's have a look at the RE from my previous post, split across several lines for better readability (join the lines without any separator before using it):
Code:
(?P<author>[^-]+)
(
( - | *-- *)
[[(]?
(?P<series>[^-]+)
[[( ]+
(?P<series_index>[0-9.]+)?
[])]?
)?
( - | *-- *)
(?P<title>.+)
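To try the split-up RE outside Calibre, the pieces can be glued back together. This is only a minimal sketch, assuming Python's re module; the filenames are invented, and the square brackets inside the character classes are escaped just to avoid a warning about nested sets in newer Python versions (they match the same characters).

```python
import re

# The pieces of the RE, one per line as in the post. They are joined with
# no separator, because the spaces inside ( - | *-- *) are significant.
PARTS = [
    r"(?P<author>[^-]+)",
    r"(",
    r"( - | *-- *)",
    r"[\[(]?",
    r"(?P<series>[^-]+)",
    r"[\[( ]+",
    r"(?P<series_index>[0-9.]+)?",
    r"[\])]?",
    r")?",
    r"( - | *-- *)",
    r"(?P<title>.+)",
]
AUTHOR_FIRST = re.compile("".join(PARTS))


def fields(name):
    """Return the named groups for a filename (no extension), or None."""
    m = AUTHOR_FIRST.match(name)
    return m.groupdict() if m else None


print(fields("Author Name - Series Name 3 - Book Title"))
print(fields("Author Name - Book Title"))
```

Note that re.VERBOSE would not be a good fit here, since it ignores the spaces that the delimiter group relies on.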
Now we rearrange the elements so that the optional series part comes first, like so
Code:
(
[[(]?
(?P<series>[^-]+)
[[( ]+
(?P<series_index>[0-9.]+)?
[])]?
( - | *-- *)
)?
(?P<author>[^-]+)
( - | *-- *)
(?P<title>.+)
Now it matches both of these forms:
series seriesnumber - author - title
author - title
Please note: if there is a series, it must be followed by a series number.
I think it is possible to construct an RE that makes the series number optional, but I do not know if it would be useful that way, and my regular expression is complicated enough as it is.
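Here is a small Python sketch to check the rearranged RE against those two forms, plus one filename with a series but no number. The filenames are invented, and the brackets inside the character classes are escaped only to avoid a nested-set warning in newer Python versions.

```python
import re

# The rearranged RE from above, joined into one pattern.
SERIES_FIRST = re.compile(
    r"([\[(]?(?P<series>[^-]+)[\[( ]+(?P<series_index>[0-9.]+)?[\])]?( - | *-- *))?"
    r"(?P<author>[^-]+)( - | *-- *)(?P<title>.+)"
)


def fields(name):
    """Return the named groups for a filename (no extension), or None."""
    m = SERIES_FIRST.match(name)
    return m.groupdict() if m else None


samples = [
    "Series Name 3 - Author Name - Book Title",
    "Author Name - Book Title",
    # No series number: the series part is skipped, so the series name
    # lands in the author field and the rest becomes the title.
    "Series Name - Author Name - Book Title",
]
for name in samples:
    print(name, "->", fields(name))
```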
Let's add the regular expression
[0-9 ]*
at the beginning of the new RE, so that it "eats up" any digits and spaces at the start of the filename.
If the leading number contains dots, put this at the beginning instead
[0-9. ]*
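To see what the prefix consumes on its own, here is a short Python sketch. The filenames are hypothetical examples, and eaten is just a helper name made up for this illustration.

```python
import re


def eaten(prefix_re, name):
    """Return the part of name that the prefix pattern consumes."""
    return re.match(prefix_re, name).group(0)


print(repr(eaten(r"[0-9 ]*", "01 Author Name - Book Title")))    # '01 '
print(repr(eaten(r"[0-9. ]*", "01. Author Name - Book Title")))  # '01. '
print(repr(eaten(r"[0-9 ]*", "01. Author Name - Book Title")))   # '01'
```

In the last case the plain prefix stops at the dot, so the dot and the space after it are left for whatever field comes next.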
--------- doesn't work ---------
Now we need to add the underscore to the possible delimiters, together with ' - ', '--', ' -- '.
So instead of
( - | *-- *)
at the end of the series, we put
( - | *-- *| *_ *)
Now the possible delimiters are ' - ', '--', ' -- ', '-- ', ' --', '_', ' _', '_ ', ' _ '.
-------- end of doesn't work -------
The construction above doesn't work, because you would also have to change (?P<series>[^-]+) to (?P<series>[^-_]+). An even bigger problem is that Calibre automatically replaces underscores in filenames with spaces. Is there a setting to switch that off?
I recommend replacing the underscore with ' - ' in the filenames before processing the files in Calibre.
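One possible way to do that replacement in bulk, sketched in Python. The function names are made up for this example, and swallowing the spaces around the underscore is my own assumption (it avoids a double space after the new delimiter). Try it on a copy of the files first.

```python
import re
from pathlib import Path


def underscore_to_dash(stem):
    """Replace each underscore, plus any spaces around it, with ' - '."""
    return re.sub(r" *_ *", " - ", stem)


def rename_in(folder):
    """Rename every file in folder whose name (without extension) has an underscore."""
    for path in Path(folder).iterdir():
        if path.is_file() and "_" in path.stem:
            path.rename(path.with_name(underscore_to_dash(path.stem) + path.suffix))


print(underscore_to_dash("Series Name 02_ Book Title - Author Name"))
# Series Name 02 - Book Title - Author Name
```

Only underscore_to_dash is exercised here; rename_in touches the disk and is meant to be called by hand on a folder you choose.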
Here is the result
Code:
[0-9 ]*([[(]?(?P<series>[^-]+)[[( ]+(?P<series_index>[0-9.]+)?[])]?( - | *-- *))?(?P<author>[^-]+)( - | *-- *)(?P<title>.+)
I will leave extensive testing of the regular expression as an exercise for the reader ;-)
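As a starting point for that testing, here is a hedged Python sketch. The filenames are invented, and the brackets inside the character classes are escaped only to avoid a nested-set warning in newer Python versions; otherwise the pattern is the one-line result above.

```python
import re

FINAL = re.compile(
    r"[0-9 ]*"
    r"([\[(]?(?P<series>[^-]+)[\[( ]+(?P<series_index>[0-9.]+)?[\])]?( - | *-- *))?"
    r"(?P<author>[^-]+)( - | *-- *)(?P<title>.+)"
)


def parse(name):
    """Return the named groups for a filename (no extension), or None."""
    m = FINAL.match(name)
    return m.groupdict() if m else None


SAMPLES = [
    "01 Series Name 3 - Author Name - Book Title",
    "Series Name 3 - Author Name - Book Title",
    "12 Author Name - Book Title",
    "Author Name--Book Title",
]
for name in SAMPLES:
    print(name, "->", parse(name))
```

Add your own real filenames to SAMPLES and compare the printed fields with what you expect.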