02-23-2020, 03:03 PM   #2
kacir
Quote:
Originally Posted by kellykline
Hi, how do I add multiple rules for the Adding books - Reading metadata section of the settings? Seems to have only one line for one regular expression, but I need it to be able to guess from different filename formats.
It has only one RE by design.
How would you want Calibre to decide which RE to use?

Here is what you can do:

Examine the filenames.

Use a file manager to separate the files into groups according to their filename format. Or write a script using the Linux "find" command or whatever.
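If you would rather script the grouping step, here is a minimal Python sketch. It is only a heuristic and not part of Calibre: the names `filename_shape` and `group_by_shape` are made up for this example, and it assumes the parts of a filename are separated by " - " and that a series index is a number at the end of a part. A title that itself ends in a number would be misread as a series, so check the groups by eye.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a series index is a trailing number such as "2" or "2.5",
# optionally followed by a closing "]" or ")".
ENDS_WITH_INDEX = re.compile(r"\s\d[\d.]*\s*[\])]?$")


def filename_shape(stem):
    """Describe a filename stem by its ' - ' separated parts.

    Parts that end in a number are marked "text #", all others "text".
    """
    parts = stem.split(" - ")
    return " - ".join(
        "text #" if ENDS_WITH_INDEX.search(part) else "text" for part in parts
    )


def group_by_shape(names):
    """Group file names (with extension) by the shape of their stem."""
    groups = defaultdict(list)
    for name in names:
        groups[filename_shape(Path(name).stem)].append(name)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "Book title - Author Name.epub",
        "Author Name - Series name 2 - Book title.epub",
        "Series name 3 - Author Name - Book title.epub",
    ]
    for shape, files in group_by_shape(sample).items():
        print(shape, files)
```

Each group can then be moved into its own folder and imported with the RE that fits it. Note that the script cannot tell "Title - Author" from "Author - Title"; that decision still has to be made by a human.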

Download the "Quick preferences" plugin by Grant Drake for Calibre. There you can define multiple regular expressions and then select the one to use for the next file that you drag and drop (or import) into Calibre. BEWARE: since the plugin was first developed, something changed in the Calibre code, so the first book you drag and drop in still uses the previous RE. That said, I haven't used it in quite some time.

Define an RE for each filename format in the plugin options. You can then select the import option for the next file from the menu.

A few examples:
Code:
Title - Author (Default)
(?P<title>.+) - (?P<author>[^_]+)

Author [- Series #]- Title
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?

Series # -  Author - Title
^(?P<series>((?!\s-\s).)+)\s(?P<series_index>[\d\.]+)\s-\s(?:(?:\[\s*)?(?P<author>.+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?

Author [(- Series #)]- Title
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:[[(]\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*[])])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
As you can see, the regular expressions can be quite elaborate and cover some variation in the filename input. Optional parts are picked up when present (such as the series name and series index in the last three examples); this is indicated by square brackets in the description. The second example is the most powerful and picks up a wide variety of inputs, for example:
Author Name - Book title
Author Name - Series name 2 - Book title
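The two sample names above can be checked outside Calibre with a short Python sketch. The pattern is the second one from the code block, copied unchanged; the function name `guess_metadata` is made up for this example, and it assumes the file extension has already been stripped.

```python
import re

# Second pattern from the post ("Author [- Series #]- Title"), split over
# three string literals only to keep the lines short.
AUTHOR_SERIES_TITLE = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)


def guess_metadata(stem):
    """Return the named groups for a filename stem, or None if nothing matches."""
    match = AUTHOR_SERIES_TITLE.match(stem)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for stem in ("Author Name - Book title",
                 "Author Name - Series name 2 - Book title"):
        print(guess_metadata(stem))
```

For the first name the series fields come back empty; for the second one the series is "Series name" and the series index is "2".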
You also have the option to swap author names in the Quick preferences plugin (firstname lastname vs. lastname, firstname).

Those REs *do* work
I realize they might appear unreadable. I wrote and/or heavily modified them a few years ago, and now they look like a bunch of scattered characters to me at first glance. You have to take them apart piece by piece to understand them. The regular expression language is quite dense, and some people jokingly describe it as "write-only".
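One way to take an RE apart is to rewrite it in Python's verbose mode, where whitespace is ignored and comments are allowed. The sketch below does that for the second pattern; the comments are my reading of each piece. It is meant for studying the pattern in a Python shell; I am not claiming Calibre's settings box accepts this layout.

```python
import re

AUTHOR_SERIES_TITLE_VERBOSE = re.compile(r"""
    ^(?P<author>((?!\s-\s).)+)       # author: every character before the first ' - '
    \s-\s                            # the ' - ' separator
    (?:                              # optional series block
        (?:\[\s*)?                   #   optional opening square bracket
        (?P<series>.+)               #   series name
        \s(?P<series_index>[\d\.]+)  #   a space, then the index (digits and dots)
        (?:\s*\])?                   #   optional closing square bracket
        \s-\s                        #   separator before the title
    )?
    (?P<title>[^(]+)                 # title: everything up to an opening parenthesis
    (?:\(.*\))?                      # optional trailing parenthesised text, not captured
""", re.VERBOSE)

if __name__ == "__main__":
    match = AUTHOR_SERIES_TITLE_VERBOSE.match(
        "Author Name - [Series name 2] - Book title (2010)"
    )
    print(match.groupdict())
```

Note that the title group stops at the opening parenthesis, so it can keep a trailing space when the filename ends in something like "(2010)".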

Calibre uses the Python flavour of the RE language
I found the book "Mastering Regular Expressions" very enlightening when I was learning regular expressions (in the distant past). I am not sure it covers the Python flavour of the RE language. There are a few threads here on MobileRead about Calibre RE specifics.