Old 09-20-2010, 03:57 PM   #1
asrrin29
Regex and Metadata from filename.

Thanks very much for taking the time to read this.

I tried reading as much as I could about regular expressions, and even as a hobbyist programmer they make my head spin. Some of my books are parsed strangely, and I was wondering if anyone could help me. Here is an example filename:

Code:
[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub
I've altered the default regex a little to get my authors to parse properly, and this is what I currently use:

Code:
(?P<title>.+) \((?P<author>[^_]+)\)
What I need is a way to tell Calibre to ignore the series and series volume info, and only parse the title and author. To make it easier, none of my titles have numbers or brackets, so if I could get a regex that ignores everything inside a bracket, plus numbers outside brackets, I'd be set. Can anyone help me?
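To make the problem concrete, here is a minimal sketch of what the pattern above captures from the example filename. It assumes the pattern is applied with Python's re.search to the file name; how Calibre applies it internally is not shown in this thread, and the variable names are made up for illustration.

```python
import re

# Pattern quoted in the post above.
CURRENT_PATTERN = re.compile(r"(?P<title>.+) \((?P<author>[^_]+)\)")

# Example filename from the post.
filename = "[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub"

match = CURRENT_PATTERN.search(filename)
title = match.group("title")
author = match.group("author")

# The author comes out clean, but the title still carries the
# series name and the volume number.
print(title)
print(author)
```

The title group starts at the beginning of the name, so the bracketed series and the number end up inside it.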

EDIT: I found a thread that lets me parse the bracketed part out into publishdate, so now all I need to do is figure out how to get the numbers to disappear.
Old 09-20-2010, 04:22 PM   #2
kacir
Quote:
Originally Posted by asrrin29 View Post
Here is an example filename:

Code:
[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub
I've altered the default regex a little to get my authors to parse properly, and this is what I currently use:

Code:
(?P<title>.+) \((?P<author>[^_]+)\)
What I need is a way to tell Calibre to ignore the series and series volume info, and only parse the title and author.
Code:
[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub
Code:
.* - (?P<title>.+) \((?P<author>.+)\)
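The leading .* eats up all the characters up to the last ' - ', because the * quantifier is greedy. Everything before that separator (the bracketed series and the volume number) is matched but not captured, so only the title and author are kept.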
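A small sketch of the greedy behaviour, again assuming the pattern is applied with Python's re.search. The lazy variant (.*?) is not from the thread; it is included only as a contrast to show what changes when the prefix stops at the first ' - ' instead of the last.

```python
import re

filename = "[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub"

# Greedy prefix: .* runs to the LAST ' - ' that still lets the rest match.
GREEDY = re.compile(r".* - (?P<title>.+) \((?P<author>.+)\)")

# Lazy prefix (illustrative contrast only): .*? stops at the FIRST ' - '.
LAZY = re.compile(r".*? - (?P<title>.+) \((?P<author>.+)\)")

greedy_match = GREEDY.search(filename)
lazy_match = LAZY.search(filename)

greedy_title = greedy_match.group("title")
lazy_title = lazy_match.group("title")
greedy_author = greedy_match.group("author")

print(greedy_title)   # series and volume number are skipped
print(lazy_title)     # volume number leaks into the title
print(greedy_author)
```

With the greedy prefix the volume number is swallowed along with the series, which is why the title comes out clean.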

Have a look at this thread
http://www.mobileread.com/forums/showthread.php?t=99258

Last edited by kacir; 09-20-2010 at 04:24 PM.
Old 09-20-2010, 05:13 PM   #3
asrrin29
Awesome! It works nearly perfectly. It fails on books that are not part of a series (title and author only), because the regex requires a ' - ' separator that those filenames don't have. I have so few of those that I can just group them in their own folder and import them as a second batch with a different regex.
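For anyone who would rather use a single pattern, one possible variation (not tested in Calibre itself, only with Python's re module) is to make the series prefix optional with a non-capturing group. The parse helper and the standalone filename below are made up for illustration.

```python
import re

# Same idea as the regex above, but the ".* - " prefix is wrapped in an
# optional non-capturing group so filenames without a series still match.
OPTIONAL_SERIES = re.compile(r"^(?:.* - )?(?P<title>.+) \((?P<author>.+)\)")


def parse(name):
    """Return (title, author) for a filename, or None if nothing matches."""
    match = OPTIONAL_SERIES.search(name)
    if match is None:
        return None
    return match.group("title"), match.group("author")


# Filename from this thread, with series and volume number.
with_series = parse(
    "[Black Fleet Crisis] - 01 - Before The Storm (Michael Mcdowell).epub"
)

# Hypothetical filename with title and author only.
standalone = parse("Before The Storm (Michael Mcdowell).epub")

print(with_series)
print(standalone)
```

One caveat: a standalone title that itself contains ' - ' would lose everything before the last separator, so the two-folder approach is still the safer choice for such files.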

Calibre is the perfect tool for what I need, and I'm glad you guys are here to give answers. Thanks!