10-23-2014, 11:59 PM   #1
kite
enturbulated
Posts: 30
Karma: 130494
Join Date: May 2007
Device: Kobo Aura HD
A new user's collection of Regular Expressions (regex) for 'Add books > Control the adding of books'

Collected by a regex illiterate.
(regex = regular expression)
In part so I can find them easily, and in part so others can find them easily.
Please point out errors and I'll try to correct them.


I've tried to show where I found each regex. The source of some is unknown, though they are probably from Starson17.

Reading "calibre User Manual > Tutorials > All about using regular expressions in calibre" will hopefully give fellow new users a general idea of what is going on and why a particular expression doesn't work for some file names.
Post numbers 9 and 10 at "understandng the sample add books regex" https://www.mobileread.com/forums/sho...d.php?t=121353 bring it down to the specifics of how a simple regex for "Author - Title.pdf" works.
If you've read the manual, looked in this thread and still can't "Control the adding of books" to your satisfaction, then the library management forum regulars are very helpful. And knowledgeable. Thank Heavens.



Authorname - Booktitle.pdf
(?P<author>[^_]+) - (?P<title>.+)
works on
William Shakespeare - Let's Dance Under the Waterfall.pdf
and
First-Name3 Sur-Hyphenated-Name3 & Firstname2 Surname2 - Let's Dance Under th....e Waterfall.pdf
notes: A calibre-recognised file extension (.pdf, .epub, .html, .zip etc.) is necessary for the test panel to work correctly. So "Authorname - Booktitle.epub" works but "Authorname - Booktitle" and "Authorname - Booktitle.nfo" don't. The file extension .txt is used in all further filename examples.
Untick the checkbox next to "Read metadata from file contents rather than file name" at the top of "The Add Process" page to allow your regex to work on books added.
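
For anyone who wants to poke at this regex outside calibre, here is a rough Python sketch. The parse() helper is made up for illustration, and it assumes the file extension is dropped before the regex is applied, which this thread does not confirm.

```python
import re

# The "Authorname - Booktitle" regex from this post.
PATTERN = re.compile(r"(?P<author>[^_]+) - (?P<title>.+)")

def parse(filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    return match.groupdict() if match else None

simple = parse("William Shakespeare - Let's Dance Under the Waterfall.pdf")
print(simple)

# [^_]+ is greedy, so the author runs up to the LAST " - " in the name.
greedy = parse("A - B - C.txt")
print(greedy)
```

The second call shows why a title containing " - " can end up partly in the author field with this expression.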




Booktitle - Authorname.txt
(?P<title>.+) - (?P<author>[^_]+)


Authorname. Booktitle.txt
(?P<author>[^_]+)\. (?P<title>.+)


Authorname.Booktitle.txt
(?P<author>[^_]+)\.(?P<title>.+)


Booktitle. Authorname.txt
(?P<title>[^_]+)\.(?P<author>.+)
note: author names written as "Firstname.Surname" or "Surname.Firstname" don't work.


Authorname AnyDotless von Fancy. Title one. T.i.t.l.e 2. Books I-III.Title Three.txt
(?P<author>.+?)\. (?P<title>.+)
works on
Dionysius of Halicarnassus. Roman Antiquities, IV. Books VI.49-VII.pdf
"It considers all characters before the first dot followed by a space as author and the rest as title"
(by JustForFun https://www.mobileread.com/forums/sho...d.php?t=246859)
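
A small Python sketch of the difference between this lazy author group (.+?) and the greedy [^_]+ version from "Authorname. Booktitle.txt" further up. The parse() helper is hypothetical, and stripping the extension before matching is an assumption.

```python
import re

# Lazy author: stops at the FIRST ". " (dot followed by a space).
LAZY = re.compile(r"(?P<author>.+?)\. (?P<title>.+)")
# Greedy author: runs on to the LAST ". " in the name.
GREEDY = re.compile(r"(?P<author>[^_]+)\. (?P<title>.+)")

def parse(pattern, filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = pattern.search(stem)
    return match.groupdict() if match else None

name = "Dionysius of Halicarnassus. Roman Antiquities, IV. Books VI.49-VII.pdf"
lazy = parse(LAZY, name)
greedy = parse(GREEDY, name)
print(lazy["author"])    # Dionysius of Halicarnassus
print(greedy["author"])  # Dionysius of Halicarnassus. Roman Antiquities, IV
```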


Authorname - seriesname series_indexnumber - Booktitle.txt
(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
note: "Smith J.S." works as an author name, but "Smith-Jones J.S." does not. See domax's post (#3, 11-04-2015) for a regex that works with double names. "Nine Moons" works as a series name, but "9 Moons" doesn't. A whole-number series index such as "45" works; "Book IV" doesn't, and [0-9]* cannot capture the decimal point in "45.5".
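
The "Nine Moons" versus "9 Moons" limit can be seen in a quick Python sketch. The filenames and the parse() helper are made up for illustration; dropping the extension and trimming spaces from the captured fields are assumptions, not something this thread confirms about calibre.

```python
import re

# The "Authorname - seriesname series_indexnumber - Booktitle" regex from this post.
PATTERN = re.compile(
    r"(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)"
    r"(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?"
)

def parse(filename):
    # Assumption: extension removed first, captured fields trimmed afterwards.
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    if match is None:
        return None
    return {key: (value or "").strip() for key, value in match.groupdict().items()}

worded = parse("Smith J.S. - Nine Moons 3 - Some Title.txt")
numbered = parse("Smith J.S. - 9 Moons 3 - Some Title.txt")
print(worded)
print(numbered)  # series and series_index come back empty here
```

Because the series group excludes digits, a series name starting with a number leaves series and series_index empty and pushes everything after the author into the title.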


Author - Seriesname series_indexnumber - Title.txt
or
Author - Title.txt
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
works with:
Bloggs, Joe - My title
Bloggs, Joe - Some Series 1 - My title.txt
Bloggs, Joe - Some Series 1.5 - My title.txt
Bloggs, Joe - Some Series 1.5 - My title with sub-title hyphen.txt
note: "9 Moons" works as seriesname, but "IV" does not work as series_index number.
(by kiwidude https://www.mobileread.com/forums/sho...d.php?t=108792)
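
A Python sketch of how the optional series part of kiwidude's regex behaves. The parse() helper and the "9 Moons" filename are made up for illustration, and removing the extension before matching is an assumption.

```python
import re

# kiwidude's regex: the series and index sit in an optional (?: ... )? group.
PATTERN = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)

def parse(filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    return match.groupdict() if match else None

plain = parse("Bloggs, Joe - My title.txt")
with_series = parse("Bloggs, Joe - Some Series 1.5 - My title with sub-title hyphen.txt")
digit_series = parse("Bloggs, Joe - 9 Moons 2 - My title.txt")
print(plain)
print(with_series)
print(digit_series)
```

When no "name number - " chunk is present the optional group is skipped, so series and series_index are simply left unset.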


Authorsurname, Authorfirstname - (Series Name - Book 01) Title of the book.txt
(?P<author>[^_-]+) - (\((?P<series>[^-]+) - Book (?P<series_index>\d\d?)\) )?(?P<title>[^-]+)
(by TheEldest https://www.mobileread.com/forums/showthread.php?t=89581)
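
A quick Python check of TheEldest's regex, with and without the bracketed series part. The parse() helper and the second filename are made up for illustration; stripping the extension first is an assumption.

```python
import re

# TheEldest's regex: "(Series Name - Book NN) " is optional.
PATTERN = re.compile(
    r"(?P<author>[^_-]+) - "
    r"(\((?P<series>[^-]+) - Book (?P<series_index>\d\d?)\) )?"
    r"(?P<title>[^-]+)"
)

def parse(filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    return match.groupdict() if match else None

in_series = parse(
    "Authorsurname, Authorfirstname - (Series Name - Book 01) Title of the book.txt"
)
standalone = parse("Authorsurname, Authorfirstname - Title of the book.txt")
print(in_series)
print(standalone)
```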


isbn.publishernamea.publishernameb.publishernamec.title1.title2.title3.title4.Month.Year.txt
Code:
(?P<isbn>\d+\w)\.(?P<publisher>\w+\.\w+\.\w+)\.(?P<title>.*)\.(?P<published>\w\w\w\.\d+)
works with
012345678X.This.Is.Publisher.This.is.a.Title.Apr.2007.pdf
876543210x.This.Is.Publisher.A.Different.Title.That.is.Longer.Jan.1997.pdf
note: "It assumes three character month abbreviations. It doesn't remove periods, except between fields. It assumes three word publisher names."
Also, the year must be four digits and later than 1900.
(from Starson17 https://www.mobileread.com/forums/sho...metadata+regex)
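
A Python sketch of how the dot-separated fields get split up by this regex. The parse() helper is made up for illustration, and it assumes the extension is removed before matching.

```python
import re

# Starson17's regex: isbn, three-word publisher, title, then Mon.Year.
PATTERN = re.compile(
    r"(?P<isbn>\d+\w)\.(?P<publisher>\w+\.\w+\.\w+)\."
    r"(?P<title>.*)\.(?P<published>\w\w\w\.\d+)"
)

def parse(filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    return match.groupdict() if match else None

first = parse("012345678X.This.Is.Publisher.This.is.a.Title.Apr.2007.pdf")
second = parse(
    "876543210x.This.Is.Publisher.A.Different.Title.That.is.Longer.Jan.1997.pdf"
)
print(first)
print(second)
```

As the note says, the dots inside the title and publisher are kept as they are; only the dots between fields are consumed.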


isbn.jumbled dat.es t.i.t.l.e...rubbish data.txt
Code:
(?mi)^(?P<isbn>[\d\-x]{9,17})
Grabs the ISBN and puts it into the correct field, so that you can download the correct metadata for the book based on the ISBN.
The whole filename is used as a placeholder title until the correct title is downloaded.
works on
0313308316.Jumbled dates.title...ru6bish data Jaf 19m8.txt
0313306419.Greenwood.Press.Rudolfo.A.Anaya.A.Critical.Companion.Oct.1999.txt
01505798756X.Silly Press.The.Strange.Professional.Title.Jul.1985.epub
(from Serpentine on https://www.mobileread.com/forums/sho...a+regex&page=2)
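
A Python sketch of the ISBN-only regex. The parse() helper is made up for illustration: it assumes the extension is removed first, and it imitates the placeholder-title behaviour described above by reusing the rest of the filename as the title.

```python
import re

# Serpentine's regex: 9 to 17 digits, hyphens or x/X at the start of the name.
PATTERN = re.compile(r"(?mi)^(?P<isbn>[\d\-x]{9,17})")

def parse(filename):
    # Assumption: the extension is removed before matching (hypothetical helper).
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    if match is None:
        return None
    # The regex has no title group, so the name itself stands in as the title.
    return {"isbn": match.group("isbn"), "title": stem}

jumbled = parse("0313308316.Jumbled dates.title...ru6bish data Jaf 19m8.txt")
with_x = parse("01505798756X.Silly Press.The.Strange.Professional.Title.Jul.1985.epub")
no_isbn = parse("Some Book.txt")
print(jumbled["isbn"])
print(with_x["isbn"])
print(no_isbn)
```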

Last edited by kite; 11-20-2015 at 12:16 AM. Reason: clearer search, correction for double names
10-25-2014, 06:00 AM   #2
kite
enturbulated
Quote:
Originally Posted by arberg
...

Code:
^(?P<author>[^-]+)(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?.*?-\s*(?P<title>[^\]{[()]+\w)
The expression expects the filename to start with Author and end with Title (possibly followed by garbage in parentheses). It also optionally matches the series and series index in the middle. It requires the title to end in an alphanumeric character, and it does not allow the title to contain any kind of parentheses (anything following a parenthesis will be discarded).

It matches the following example filenames:
Author Harris - Kingdom come
Author Harris - Kingdom come (v1.0)
Author Harris - Kingdom series - The Very Magical Kingdom (v1.0)
Author Harris - Kingdom series 14.5 - The Very Magical Kingdom (v1.0)
Author Harris - [Kingdom series 14.5] - The Very Magical Kingdom (v1.0)
Author Harris - [Kingdom series 14.5] [another useless string]- The Very Magical Kingdom (v1.0)

Author: Author Harris
Title: Kingdom come
Series: Kingdom series
Series Index: 14.5
(from https://www.mobileread.com/forums/showthread.php?t=88896)
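
A Python sketch of arberg's regex on two of the example filenames. The ".epub" extension is added here so there is something to strip, and the parse() helper, the extension stripping and the trimming of captured fields are all assumptions made for illustration.

```python
import re

# arberg's regex: author, optional series and index, then the title.
PATTERN = re.compile(
    r"^(?P<author>[^-]+)"
    r"(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?"
    r".*?-\s*(?P<title>[^\]{[()]+\w)"
)

def parse(filename):
    # Assumption: extension removed first, captured fields trimmed afterwards.
    stem = filename.rpartition(".")[0]
    match = PATTERN.search(stem)
    if match is None:
        return None
    return {
        key: value.strip() if value is not None else None
        for key, value in match.groupdict().items()
    }

bracketed = parse(
    "Author Harris - [Kingdom series 14.5] - The Very Magical Kingdom (v1.0).epub"
)
plain = parse("Author Harris - Kingdom come.epub")
print(bracketed)
print(plain["author"], "/", plain["title"])
```

The "(v1.0)" tail is dropped because the title group stops at the first parenthesis and must end on a word character.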

Last edited by kite; 10-25-2014 at 06:03 AM. Reason: add CODE /CODE
11-04-2015, 06:51 AM   #3
domax
Member
 
Posts: 14
Karma: 10
Join Date: Oct 2014
Location: Koblenz, Germany
Device: Kindle Fire HD 6, Pocketbook Touch Lux, Tolino Vision
Wrong Regex

Hello,
it's a very good idea to help new users with regex. I need them especially for adding books, and I find them useful (and difficult) for searches in the book list.
Therefore we should have a place where new users can find many examples.
But one example here goes wrong if there's a double name like:
Morland-Miller, A. F. - Tony Ballard 001 - Help me.txt
Your regex
Quote:
(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
doesn't match it correctly, but

Quote:
(?P<author>[^_]+) - (?P<series>[^_0-9-]+) (?P<series_index>[0-9]+) - (?P<title>[^_].+)
is correct.
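
A Python sketch comparing the two regexes on the double-name example. The parse() helper is made up for illustration; removing the extension and trimming the captured fields are assumptions.

```python
import re

# kite's original regex: the author group may not contain a hyphen.
KITE = re.compile(
    r"(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)"
    r"(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?"
)
# domax's version: hyphens are allowed in the author, fields are split on " - ".
DOMAX = re.compile(
    r"(?P<author>[^_]+) - (?P<series>[^_0-9-]+) "
    r"(?P<series_index>[0-9]+) - (?P<title>[^_].+)"
)

def parse(pattern, filename):
    # Assumption: extension removed first, captured fields trimmed afterwards.
    stem = filename.rpartition(".")[0]
    match = pattern.search(stem)
    if match is None:
        return None
    return {key: (value or "").strip() for key, value in match.groupdict().items()}

name = "Morland-Miller, A. F. - Tony Ballard 001 - Help me.txt"
kite_result = parse(KITE, name)
domax_result = parse(DOMAX, name)
print(kite_result["author"])   # Miller, A. F.
print(domax_result["author"])  # Morland-Miller, A. F.
```

With the first regex the match can only start after the hyphen in the surname, so "Morland-" is lost from the author.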