09-27-2011, 05:39 AM | #1 |
Junior Member
Posts: 8
Karma: 10
Join Date: Sep 2011
Device: IQ pocketbook
|
Importing Ebooks, regexp confusion
Hi all,
I'm trying to import fields from formatted filenames into calibre. Here is the regexp I'm trying to make work:
((?P<series>.+) # (?P<series_index>.+) - |)(?P<title>.+)( \[(?P<author>.+)\]|)
If the book belongs to a series, the filename starts with:
- "**Series name** # **Series number** - "
There might not be a series name, so after this comes the title:
- **title**
And afterwards I add the authors, but not always:
- "[**author**]"
Here is a sample of one of the filenames I want to parse:
Serie Name # 01 - Title name [author1 & Author2].epub
This works:
((?P<series>.+) # (?P<series_index>.+) - |)(?P<title>.+)
This works if there is a title:
(?P<title>.+) \[(?P<author>.+)\]
But this:
((?P<series>.+) # (?P<series_index>.+) - |)(?P<title>.+)( \[(?P<author>.+)\]|)
always gives the following error:
Attribute editor :'Nonetype' object has no attribute replace
Can anyone help?
Regards |
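A minimal Python sketch of why the combined pattern leaves the author empty, assuming the expression is applied with Python's `re` module to the filename without its extension (an assumption for illustration, not taken from calibre's source). The greedy `(?P<title>.+)` swallows the bracketed author, so the trailing group falls back to its empty alternative and `author` comes back as `None`.

```python
import re

# The combined pattern from the post, unchanged.
pattern = re.compile(
    r"((?P<series>.+) # (?P<series_index>.+) - |)"
    r"(?P<title>.+)( \[(?P<author>.+)\]|)"
)

# Sample filename from the post, extension removed (assumed preprocessing).
name = "Serie Name # 01 - Title name [author1 & Author2]"

match = pattern.search(name)
fields = match.groupdict()

# The greedy title consumes the "[...]" part as well.
print(fields["title"])
# The author group took its empty alternative, so it did not participate.
print(fields["author"])
```

Any later code that calls a string method such as `.replace` on that `None` value would fail with the kind of error quoted in the post.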
09-27-2011, 10:35 AM | #2 |
Junior Member
Posts: 8
Karma: 10
Join Date: Sep 2011
Device: IQ pocketbook
|
tried some more ...
I tried again and got some good results ... without the author
Here is the list of all my possible filenames:
1) Series # 00.zip
2) Series # 00 - Title.zip
3) Series # 00 - Title [author].zip
4) Series # 00 [Author].zip
5) Title.zip
6) Title [Author].zip
Without the author, I found that this works perfectly for 1, 2 and 5:
(?P<series>.+) # (?P<series_index>\d+)( - (?P<title>.+)|)
For case 1, series and series_index are filled, AND title = Series # 00
For case 2, series, index and title are filled
For case 5, title is filled
But every time I try to add the " [Author]", everything gets messed up.
I know the parsing should go like
( \[(?P<author>.+)\[)
but I cannot make it conditional. I added a "?" at the end, but after reading some other forums, I still do not fully understand what "?" means.
I tried the trick (( \[(?P<author>.+)\[)|), but that throws the error "Attribute editor :'Nonetype' object has no attribute replace"
Can anyone help? |
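A rough sketch of the two points raised in this post, assuming plain Python `re` and filenames with the extension already removed (both assumptions). It first runs the series/title pattern on cases 1, 2 and 5, then shows that `(...)?` means "zero or one occurrence of the group" and accepts the same names as the `(...|)` trick. The two bracket patterns at the end are illustrative and assume the closing bracket was meant to be `\]`.

```python
import re

# Pattern from the post: the title part is optional via an empty alternative.
partial = re.compile(r"(?P<series>.+) # (?P<series_index>\d+)( - (?P<title>.+)|)")

case1 = partial.search("Series # 00").groupdict()
case2 = partial.search("Series # 00 - Title").groupdict()
case5 = partial.search("Title")  # no " # " in the name, so no match at all
print(case1, case2, case5, sep="\n")

# "(...)?" and "(...|)" both make the group optional. In both forms the
# inner named group is None when the optional part is absent.
with_alternative = re.compile(r"^(?P<title>[^\[]+?)( \[(?P<author>.+)\]|)$")
with_question = re.compile(r"^(?P<title>[^\[]+?)( \[(?P<author>.+)\])?$")

for name in ["Title", "Title [Author]"]:
    a = with_alternative.search(name).groupdict()
    q = with_question.search(name).groupdict()
    print(name, a, q)
```

In plain `re`, case 1 leaves `title` as `None` and case 5 does not match at all, so the titles reported in the post for those cases presumably come from a fallback in calibre rather than from the expression itself.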
09-27-2011, 11:25 AM | #3 |
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@kovid: an AttributeError exception is not caught in the try block in ebooks/metadata/meta.py lines 149-165.
This used to work. Was there an update somewhere that changed referencing None to an AttributeError? |
09-27-2011, 11:28 AM | #4 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I'm going to need more context. Are you saying that match.group('authors') is raising an AttributeError?
|
09-27-2011, 11:31 AM | #5 | |
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
Specifically, if (?P<author>... is in the pattern, and if that part of the pattern matches nothing, then match.group... returns None. What I don't understand is "what changed?" |
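The behaviour described here can be reproduced outside calibre. A minimal sketch, assuming plain Python `re`; the pattern is simplified and `clean_field` is a hypothetical helper for illustration, not calibre's actual code. A named group that did not take part in the match yields `None`, and calling a string method on it raises `AttributeError`.

```python
import re

pattern = re.compile(r"(?P<title>[^\[]+)( \[(?P<author>.+)\])?")
match = pattern.search("Title")

author = match.group("author")  # None: the optional group did not participate

try:
    author.replace("_", " ")
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)


def clean_field(value):
    """Hypothetical helper: tidy a captured field, tolerating None."""
    if value is None:
        return None
    return value.replace("_", " ").strip()


print(clean_field(author))
print(clean_field(match.group("title")))
```

Guarding on `None` before touching the value is one way to keep an unmatched optional group from aborting the whole import.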
|
09-27-2011, 11:42 AM | #6 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I don't recall anything changing that could affect this, but who knows...
I've committed a fix. |
09-27-2011, 12:17 PM | #7 |
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@ZaMotH:
Try Code:
((?P<series>.+) # (?P<series_index>.+) -\s*)?(?P<title>[^\[]*)(\s*\[(?P<author>.+)\])? |
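To see what the suggested expression captures, here is a short sketch that runs it over the six filename shapes listed earlier in the thread. It assumes plain Python `re`, filenames with the extension removed, and whitespace stripped from each captured field afterwards; `parse` is a hypothetical helper, not part of calibre.

```python
import re

# Pattern suggested in the post above, unchanged.
pattern = re.compile(
    r"((?P<series>.+) # (?P<series_index>.+) -\s*)?"
    r"(?P<title>[^\[]*)(\s*\[(?P<author>.+)\])?"
)

names = [
    "Series # 00",
    "Series # 00 - Title",
    "Series # 00 - Title [author]",
    "Series # 00 [Author]",
    "Title",
    "Title [Author]",
]


def parse(name):
    """Return the named groups, with surrounding whitespace stripped."""
    fields = pattern.search(name).groupdict()
    return {k: (v.strip() if v is not None else None) for k, v in fields.items()}


results = {name: parse(name) for name in names}
for name, fields in results.items():
    print(f"{name!r}: {fields}")
```

Under these assumptions, names that contain the " - " separator fill series, index and title, and a bracketed author is picked up whenever it is present. Names without " - " (cases 1 and 4) skip the optional series group, so "Series # 00" ends up in the title.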
09-27-2011, 02:21 PM | #8 |
Junior Member
Posts: 8
Karma: 10
Join Date: Sep 2011
Device: IQ pocketbook
|
Thank you for this quick answer, and quick fix.
Actually, I'm using Calibre Portable ... I prefer portable applications. I will wait until the next release, I guess. Thanks to both of you. |