Hi there
Here's my problem: I have a bunch of PDF files named like these examples:
Name Surname - Name of the Series 01 - Title of the Book.pdf
or
Name Surname - Title of the Book.pdf
For the first one I use this:
(?P<author>[^_]+) - (?P<series>[^_]+) (?P<series_index>[0-9]+) - (?P<title>.+)
And for the second example I use:
(?P<author>[^_]+) - (?P<title>.+)
The problem is that the parsing cuts off the last word, so the title comes out as "Title of the".
Anyway, is it possible to combine those two expressions so that the parser can tell whether or not the filename contains a series part (xxx - xxx instead of xxx - xxx 3 - xxx)?
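To make the goal concrete, here is a minimal Python sketch of the kind of combined expression I mean. It is only an illustration, not something I have confirmed inside calibre: the series part is wrapped in an optional group, and I swapped [^_]+ for lazy .+? because a greedy author group would swallow the series as well. The helper name parse_filename and the step that strips the extension are my own assumptions.

```python
import os
import re

# Hypothetical combined pattern (an assumption, not a confirmed calibre setting).
# The series block is an optional non-capturing group, so both naming schemes
# can match. Lazy quantifiers (.+?) keep "author" and "series" from running
# past the first " - " separator.
COMBINED = re.compile(
    r"^(?P<author>.+?) - "
    r"(?:(?P<series>.+?) (?P<series_index>[0-9]+) - )?"
    r"(?P<title>.+)$"
)


def parse_filename(filename):
    """Return a dict of the named groups, or None if the name does not match."""
    # Assumption: the pattern is applied to the name without its extension.
    stem, _ext = os.path.splitext(filename)
    match = COMBINED.match(stem)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    for name in (
        "Name Surname - Name of the Series 01 - Title of the Book.pdf",
        "Name Surname - Title of the Book.pdf",
    ):
        print(parse_filename(name))
```

With this sketch, the series and series_index groups come back empty (None) when the filename has no series part. One caveat: a title that itself contains something like "Catch 22 - Notes" would be mistaken for a series, so it is not foolproof.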
The other problem I have is that calibre looks inside the PDF for the title and author fields, and sometimes this results in garbled text. Is there a way to override this and use only the data parsed from the filename?
Thanks in advance for any advice.
P.S.
Sorry for my subpar English.