#1
enturbulated
Posts: 30
Karma: 130494
Join Date: May 2007
Device: Kobo Aura HD
A new user's collection of Regular Expressions (regex) for 'Add books > Control the adding of books'
Collected by a regex illiterate (regex = regular expression), in part so I can find them easily, and in part so others can find them easily. Please point out errors and I'll try to correct them. I've tried to show where I found each regex, but the source of some is unknown (probably Starson17).

Reading "calibre User Manual> Tutorials> All about using regular expressions in calibre" will hopefully give fellow new users a general idea of what is going on and why a particular expression doesn't work for some file names. Posts 9 and 10 of "understandng the sample add books regex" https://www.mobileread.com/forums/sho...d.php?t=121353 get down to the specifics of how a simple regex for "Author - Title.pdf" works.

If you've read the manual, looked in this thread and still can't "Control the adding of books" to your satisfaction, then the library management forum regulars are very helpful. And knowledgeable. Thank Heavens.

Authorname - Booktitle.pdf
(?P<author>[^_]+) - (?P<title>.+)
works on
William Shakespeare - Let's Dance Under the Waterfall.pdf
and
First-Name3 Sur-Hyphenated-Name3 & Firstname2 Surname2 - Let's Dance Under th....e Waterfall.pdf
notes: A calibre recognised file extension (.pdf .epub .html .zip etc) is necessary for the test panel to work correctly. So "Authorname - Booktitle.epub" works but "Authorname - Booktitle" and "Authorname - Booktitle.nfo" don't. File extension .txt is used in all further filename examples.
Untick the checkbox next to "Read metadata from file contents rather than file name" at the top of "The Add Process" page to allow your regex to work on books added.

Booktitle - Authorname.txt
(?P<title>.+) - (?P<author>[^_]+)

Authorname. Booktitle.txt
(?P<author>[^_]+)\. (?P<title>.+)

Authorname.Booktitle.txt
(?P<author>[^_]+)\.(?P<title>.+)

Booktitle. Authorname.txt
(?P<title>[^_]+)\.(?P<author>.+)
note: "Firstname.Surname" and "Surname.Firstname" don't work.

Authorname AnyDotless von Fancy. Title one. T.i.t.l.e 2. Books I-III.Title Three.txt
(?P<author>.+?)\. (?P<title>.+)
works on
Dionysius of Halicarnassus. Roman Antiquities, IV. Books VI.49-VII.pdf
"It considers all characters before the first dot followed by a space as author and the rest as title" (by JustForFun https://www.mobileread.com/forums/sho...d.php?t=246859)

Authorname - seriesname series_indexnumber - Booktitle.txt
(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
note: "Smith J.S." works as author name, but "Smith-Jones J.S." does not, because the author part of this regex excludes hyphens. See domax's post of 11-04-2015, 07:51 PM for a regex that works with double names.
"Nine Moons" works as a series name, but "9 Moons" doesn't.
"45.5" works as series index, "Book IV" doesn't.

Author - Seriesname series_indexnumber - Title.txt or Author - Title.txt
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
Bloggs, Joe - My title
Bloggs, Joe - Some Series 1 - My title.txt
Bloggs, Joe - Some Series 1.5 - My title.txt
Bloggs, Joe - Some Series 1.5 - My title with sub-title hyphen.txt
note: "9 Moons" works as seriesname, but "IV" does not work as series_index number.
(by kiwidude https://www.mobileread.com/forums/sho...d.php?t=108792)

Authorsurname, Authorfirstname - (Series Name - Book 01) Title of the book.txt
(?P<author>[^_-]+) - (\((?P<series>[^-]+) - Book (?P<series_index>\d\d?)\) )?(?P<title>[^-]+)
(by TheEldest https://www.mobileread.com/forums/showthread.php?t=89581)

isbn.publishernamea.publishernameb.publishernamec.title1.title2.title3.title4.Month.Year.txt
Code:
(?P<isbn>\d+\w)\.(?P<publisher>\w+\.\w+\.\w+)\.(?P<title>.*)\.(?P<published>\w\w\w\.\d+)
012345678X.This.Is.Publisher.This.is.a.Title.Apr.2007.pdf
876543210x.This.Is.Publisher.A.Different.Title.That.is.Longer.Jan.1997.pdf
note: "It assumes three character month abbreviations. It doesn't remove periods, except between fields. It assumes three word publisher names." Also, the year must be four numerals and >1900.
(from Starson17 https://www.mobileread.com/forums/sho...metadata+regex)

isbn.jumbled dat.es t.i.t.l.e...rubbish data.txt
Code:
(?mi)^(?P<isbn>[\d\-x]{9,17})
The whole filename is used as a placeholder title until the correct title is downloaded.
works on
0313308316.Jumbled dates.title...ru6bish data Jaf 19m8.txt
0313306419.Greenwood.Press.Rudolfo.A.Anaya.A.Critical.Companion.Oct.1999.txt
01505798756X.Silly Press.The.Strange.Professional.Title.Jul.1985.epub
(from Serpentine on https://www.mobileread.com/forums/sho...a+regex&page=2)

Last edited by kite; 11-20-2015 at 12:16 AM. Reason: clearer search, correction for double names
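For anyone who wants to try a pattern outside the calibre test panel, here is a minimal Python sketch (the named-group syntax used in these patterns is Python's). The helper name parse_filename is made up for illustration. Dropping the file extension and letting the pattern match anywhere in the name are assumptions of this sketch, not something taken from calibre itself, so results may differ from the test panel in edge cases.

```python
import os
import re


def parse_filename(pattern, filename):
    """Return the named groups a pattern pulls out of a file name, or None.

    Assumptions of this sketch (not confirmed calibre behaviour):
    the extension is removed first, and the pattern may match anywhere
    in the remaining name (re.search rather than re.match).
    """
    stem, _extension = os.path.splitext(filename)
    match = re.search(pattern, stem)
    return match.groupdict() if match else None


# Two patterns copied from the post above.
AUTHOR_TITLE = r"(?P<author>[^_]+) - (?P<title>.+)"
ISBN_ONLY = r"(?mi)^(?P<isbn>[\d\-x]{9,17})"

simple = parse_filename(
    AUTHOR_TITLE,
    "William Shakespeare - Let's Dance Under the Waterfall.pdf",
)
isbn = parse_filename(
    ISBN_ONLY,
    "0313306419.Greenwood.Press.Rudolfo.A.Anaya.A.Critical.Companion.Oct.1999.txt",
)

print(simple)
print(isbn)
```

A file name that the pattern cannot match at all (for example one with no " - " separator for the first pattern) gives None, which is a quick way to spot names that need a different regex.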
#2
Quote:
Last edited by kite; 10-25-2014 at 06:03 AM. Reason: add CODE /CODE |
#3
Member
Posts: 14
Karma: 10
Join Date: Oct 2014
Location: Koblenz, Germany
Device: Kindle Fire HD 6, Pocketbook Touch Lux, Tolino Vision
Wrong Regex
Hello,
it's a very good idea to help new users with regex. I need them especially for adding books, and I find them useful (and difficult) for searches in the book list. Therefore we should have a place where new users can find many examples.
But here is one example that goes wrong if the author has a double name like:
Morland-Miller, A. F. - Tony Ballard 001 - Help me.txt
Your regex
Quote:
Quote:
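The double-name problem can be reproduced with a short Python sketch. It runs the series pattern from post #1 (whose author part excludes hyphens) and kiwidude's pattern from the same post against the file name above. Two assumptions are baked in: the extension has already been removed, and the pattern is allowed to match anywhere in the name (re.search). This is only an illustration of the two patterns quoted in this thread, not of the corrected regex referred to in post #1.

```python
import re

# Series pattern from post #1: the author class [^_-] cannot contain "-".
DEFAULT = (r"(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)"
           r"(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?")

# kiwidude's pattern from post #1: the author runs up to the first " - ".
KIWIDUDE = (r"^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s"
            r"(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
            r"(?P<title>[^(]+)(?:\(.*\))?")

# File name from this post, extension already stripped (assumption).
name = "Morland-Miller, A. F. - Tony Ballard 001 - Help me"

default_hit = re.search(DEFAULT, name)
kiwidude_hit = re.search(KIWIDUDE, name)

# The hyphen-excluding pattern loses the first half of the double name.
print(default_hit.group("author"))

# The " - " delimited pattern keeps the whole author and splits the rest.
print(kiwidude_hit.group("author"))
print(kiwidude_hit.group("series"), kiwidude_hit.group("series_index"))
print(kiwidude_hit.group("title"))
```

If the pattern were instead anchored at the start of the name (re.match), the first regex would not match this file name at all, so either way it cannot return "Morland-Miller, A. F." as the author.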