12-10-2014, 11:50 AM   #1
Vortex
Help with regular expressions for adding books

I have a large group of old books that have no metadata but do have good file names. I want to import them without messing up the Author and Title, so I can use those fields to get the rest of the info from 'Download Metadata & Covers'. Every expression I try seems to mess up some of them. The books are all in the format:

Author - Series - Title.ext OR Author - Title.ext

I've been trying to write a simple regex that captures everything before the first " - " as the author's name, no matter what it contains, everything after the last " - " as the title, no matter what it contains, and everything in between as the series info, if present.

This works as long as there's a series but fails if there isn't, and I can't seem to make the series optional:

^(?P<author>.+) - (?P<series>.*) - (?P<title>.+)

Can anyone help, please? Thanks.
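
The failure described in this post can be reproduced outside Calibre. This is only a minimal sketch, assuming Python's standard re module and made-up file names with the extension already removed; the helper name parse is hypothetical. Because both " - " separators are mandatory in the pattern, a name with only one separator does not match at all:

```python
import re

# Pattern from the post: both " - " separators are required.
PATTERN = re.compile(r"^(?P<author>.+) - (?P<series>.*) - (?P<title>.+)")


def parse(name):
    """Return the named groups as a dict, or None when nothing matches."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None


# Made-up sample names (assumption: extension already stripped).
with_series = parse("Author - Series - Title")
without_series = parse("Author - Title")

print(with_series)     # all three fields are filled in
print(without_series)  # None: the second " - " is missing
```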

12-10-2014, 05:29 PM   #2
BetterRed
I would separate the files into two folders, 'with series' and 'without series'. Stefan's Tools includes an easy-to-use GUI grep for Windows.

Then add the books from each folder separately, with a different regular expression for each, set in Preferences->Add Books.

BR
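
If a script is preferred over a GUI grep, the sorting step could be sketched roughly as below. This is an assumption-laden illustration, not part of Stefan's Tools or Calibre: the function name split_by_series is hypothetical, and it simply counts " - " separators in the name, so a title that itself contains " - " would be put in the wrong group.

```python
import os

SEPARATOR = " - "


def split_by_series(filenames):
    """Split names into (with_series, without_series) by counting separators."""
    with_series, without_series = [], []
    for name in filenames:
        stem, _ext = os.path.splitext(name)  # drop the extension
        if stem.count(SEPARATOR) >= 2:
            with_series.append(name)
        else:
            without_series.append(name)
    return with_series, without_series


# Made-up sample names for illustration only.
sample = ["Author - Series - Title.epub", "Author - Title.txt"]
series_files, plain_files = split_by_series(sample)
print(series_files)
print(plain_files)
```

The two lists could then be moved into separate folders and added with one expression each.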

12-11-2014, 12:32 PM   #3
mzmm
I haven't tested this, but it seems like it would work:

Code:
^(?P<author>.+) - ((?P<series>.+) - )?(?P<title>.+)

12-11-2014, 01:53 PM   #4
Vortex
Thanks for the suggestions.

mzmm, unfortunately it doesn't work: it adds any Series info to the Author. The problem is that anything that allows a hyphenated name won't stop at the first " - " delimiter. I'll try one of the coding forums and see if I can get something there.

BR, I know I can do it that way but it's a long way round and shouldn't be necessary.
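
The greedy-match problem with mzmm's pattern can be checked with a minimal sketch, assuming Python's standard re module and a made-up file name with the extension removed. The greedy `.+` in the author group runs to the last " - " that still lets the rest match, so the series ends up in the author. The LAZY variant is not from this thread; it is just one possible way (a non-greedy `.+?`) to make the author stop at the first " - ":

```python
import re

# mzmm's pattern, copied from the earlier post.
GREEDY = re.compile(r"^(?P<author>.+) - ((?P<series>.+) - )?(?P<title>.+)")

# Hypothetical variant (not from the thread): non-greedy author group.
LAZY = re.compile(r"^(?P<author>.+?) - ((?P<series>.+) - )?(?P<title>.+)")

name = "Author - Series - Title"  # made-up sample
greedy_result = GREEDY.match(name).groupdict()
lazy_result = LAZY.match(name).groupdict()

print(greedy_result)  # the author group swallows the series
print(lazy_result)    # the author group stops at the first " - "
```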

12-11-2014, 02:15 PM   #5
kacir
This works:
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
Slightly different implementation
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:[[(]\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*[])])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
I had to put those in "code" tags, because the forum server converted some substrings to smileys ;-)

Install the plugin called "Quick preferences"; it contains the first of those REs, I think.
The second one was created by me, I think ;-). Search for my posts and the phrase "Regular expression".
The RE covers lots of scenarios, including missing series info, a series or series number in brackets (square or round), and varying amounts and kinds of whitespace.

Calibre is written in Python and uses Python's regular expression syntax.
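
Since the syntax is Python's, the first expression can be tried in a plain Python session before putting it into Calibre. A minimal sketch, assuming the standard re module and made-up file names with the extension already removed; the helper name parse is hypothetical. The `((?!\s-\s).)+` part consumes one character at a time and refuses to step onto a " - ", which is what keeps the author from running past the first separator:

```python
import re

# First expression from this post, copied verbatim into a raw string.
PATTERN = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)


def parse(name):
    """Return the named groups as a dict, or None when nothing matches."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None


# Made-up sample names for illustration only.
numbered = parse("Author - Series 1 - Title")
no_series = parse("Author - Title")

print(numbered)
print(no_series)
```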

12-11-2014, 05:07 PM   #6
Vortex
Actually neither of those can cope with the file name "Author - Series - Title.txt"

Your suggestions were really helpful though, as I was able to pull the code apart and figure out how to adapt it. This works perfectly for my needs:

Code:
^(?P<author>((?!\s-\s).)+)\s-\s((?P<series>.+)?\s-\s)?(?P<title>.+)?
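
For anyone who wants to check the expression before importing, here is a minimal sketch using Python's standard re module. The file names are made up and have the extension removed, and the helper name parse is hypothetical. The author stops at the first " - ", the title is whatever follows the last " - ", and anything in between goes to the series:

```python
import re

# Final expression from the post above, copied verbatim.
FINAL = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s((?P<series>.+)?\s-\s)?(?P<title>.+)?"
)


def parse(name):
    """Return (author, series, title), or None when nothing matches."""
    m = FINAL.match(name)
    if m is None:
        return None
    return m.group("author"), m.group("series"), m.group("title")


# Made-up sample names for illustration only.
samples = [
    "Author - Series - Title",
    "Author - Title",
    "Jean-Paul Sartre - Nausea",
    "Author - Series - Part One - Title",
]
results = {name: parse(name) for name in samples}
for name, fields in results.items():
    print(name, "->", fields)
```

A hyphen inside a name is left alone because only a hyphen with whitespace on both sides counts as a separator.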