10-15-2014, 04:56 PM   #1
Rob557
Zealot
Posts: 108
Karma: 810
Join Date: Jul 2012
Device: Kobo
Adding books - filename RegEx author FN (initials) LN

Apologies if this has already been answered; I couldn't find a post that covers it, and I struggle to understand RegEx (even with the help of RegexBuddy). FWIW, this is a special case: I do not have metadata and have to rely on the existing filename to determine the author and title.

(?P<author>.+) - (?P<title>[^_]+) is the default RegEx that I currently have under "Preferences" for adding books to Calibre. I use it to obtain the author and title from the filename rather than from metadata, but with that RegEx the order of FN LN for the author comes out flipped:

filename 1: Tom Jones - My Book.epub
resulting author: Jones Tom

filename 2: Tom G. Jones - My Book.epub
resulting author: G. Jones Tom
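
For reference, here is a minimal sketch of what the default pattern captures on its own when run through Python's re module. The helper name parse_filename and the step that drops the ".epub" extension are my own assumptions for illustration, and I don't know whether Calibre applies the pattern in exactly this way.

```python
import os
import re

# Default pattern quoted in this post.
DEFAULT_PATTERN = re.compile(r"(?P<author>.+) - (?P<title>[^_]+)")


def parse_filename(filename):
    """Return the named groups captured from a filename, or None if no match.

    Hypothetical helper: dropping the extension first is an assumption.
    """
    stem, _ext = os.path.splitext(filename)
    match = DEFAULT_PATTERN.search(stem)
    return match.groupdict() if match else None


for name in ("Tom Jones - My Book.epub", "Tom G. Jones - My Book.epub"):
    print(name, "->", parse_filename(name))
```

In this sketch the author group comes back in the same order as it appears in the filename, so the pattern by itself does not seem to reorder FN and LN. If that holds inside Calibre too, the swap presumably happens in some later step.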

I've also tried other, more sophisticated RegEx that I have seen posted, which optionally accommodate a series name and number, but they all seem to have the same strange effect when determining the author.
e.g.
^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?(?P<title>.+)
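
The same kind of check can be run on the longer pattern. This is only a sketch under my own assumptions: parse_with_series is a hypothetical helper, the second filename (with a series) is made up for illustration, and the surrounding whitespace is stripped from each group because the author group can capture the space before the hyphen.

```python
import re

# Longer pattern quoted in this post, split across lines for readability only.
SERIES_PATTERN = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?"
    r"(?P<title>.+)"
)


def parse_with_series(stem):
    """Return the named groups with surrounding whitespace stripped (hypothetical helper)."""
    match = SERIES_PATTERN.match(stem)
    if match is None:
        return None
    # Groups that did not take part in the match stay None.
    return {key: value.strip() if value else value
            for key, value in match.groupdict().items()}


for stem in ("Tom G. Jones - My Book", "Tom Jones - Saga 2 - My Book"):
    print(stem, "->", parse_with_series(stem))
```

Here too the author group keeps the order found in the filename, at least under plain Python.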

I think I must be missing something, because it seems counterintuitive for the default behavior of these various RegEx approaches to be switching the order of FN and LN as it appears in the filename. The results don't even generate the comma version of FN, LN, which would at least let me fix the problem in Calibre.

Thanks for any help on this.