Old 06-08-2011, 08:02 PM   #1
jevonbrady
Junior Member
jevonbrady began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Jun 2011
Device: none
Help with regular expressions

Hi,
I have some books that have been named in the following manner.

title [author, published year]{some text}.pdf

e.g. Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009]{Summary}.pdf

I want to add these correctly in calibre, i.e. get the title, author and published year, while ignoring the string {Summary}.

What would be the regular expression for this? I am a complete newbie here so please help.
Old 06-09-2011, 02:07 AM   #2
Manichean
Wizard
Manichean is the 'tall, dark, handsome stranger' all the fortune-tellers are referring to.

Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
There's a tutorial for regular expressions available. I'm a little short on time right now, so I can't be more helpful, but I should be able to write a little more by tomorrow, if no one else has done it by then.
Old 06-09-2011, 02:16 PM   #3
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by Manichean View Post
if no one else has done it by then.
This will do it. The published date may come out odd, since the pattern only captures the year:
Code:
(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)]
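To see what the pattern captures before using it in calibre, it can be tried with Python's standard re module. This is only a minimal sketch: it assumes calibre searches the file name with the pattern and reads the named groups, and the variable names here are made up for illustration.

```python
import re

# Pattern from the post above, unchanged.
PATTERN = re.compile(
    r"(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)]"
)

# Sample file name from the original question.
filename = "Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009]{Summary}.pdf"

# search() finds the pattern anywhere in the name; the trailing
# "{Summary}.pdf" is simply left unmatched.
match = PATTERN.search(filename)
fields = match.groupdict() if match else {}
print(fields)
```

Note that the title and author classes exclude underscores and hyphens, so a file name containing either in those positions would need a different pattern.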
Old 06-20-2011, 07:49 PM   #4
jevonbrady
Thanks, that worked! Is this standard Python? I'm asking since I want to start doing this myself at some point.
Old 06-20-2011, 07:57 PM   #5
jevonbrady
Actually, I have a follow-up question. Many of my books are named so that their tags are included in the file name, always enclosed in curly braces. For example, consider the book I mentioned earlier.

Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009].pdf

If there were tags for this book, the name of the book would be as follows:

Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009]{Innovation, Cost Control}.pdf

How would the regular expression be changed to import the tags as well? This would save me a huge amount of work to classify articles and books. Thanks in advance.

BTW, I tried the following:

(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)] \{(?P<tags>[^_0-9-]*)}

but that just imports the entire filename as the title, which does not help me at all! I also tried <tag> instead of <tags> as the group name. Any pointers?
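One detail of the attempt can be checked in plain Python: the pattern has a space between "]" and "\{", while the file name has none, so the expression does not match at all. A non-matching pattern would be consistent with the whole file name ending up as the title. The sketch below only tests the regular expression itself; it says nothing about which group names calibre accepts, and the variable names are made up for illustration.

```python
import re

filename = (
    "Rethink ~ Cut Costs Boost Innovation "
    "[Ric Merrifield, 2009]{Innovation, Cost Control}.pdf"
)

# The attempt from the post: note the space between "]" and "\{".
attempt = (
    r"(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)]"
    r" \{(?P<tags>[^_0-9-]*)}"
)

# The same pattern with that space removed, to follow the file name layout.
adjusted = (
    r"(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)]"
    r"\{(?P<tags>[^_0-9-]*)}"
)

attempt_match = re.search(attempt, filename)    # no match: "] {" never occurs
adjusted_match = re.search(adjusted, filename)  # matches "]{...}"

print(attempt_match)
print(adjusted_match.group("tags") if adjusted_match else None)
```

The tags class also excludes digits, underscores and hyphens, so a tag such as "Web 2.0" would still fail to match with the adjusted pattern.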

Last edited by jevonbrady; 06-20-2011 at 08:04 PM. Reason: adding details
Old 06-21-2011, 10:11 AM   #6
Starson17
Quote:
Originally Posted by jevonbrady View Post
Thanks, that worked! Is this standard Python? I'm asking since I want to start doing this myself at some point.
Yes, it's standard Python regular expression syntax.
Old 06-21-2011, 10:16 AM   #7
Starson17
Quote:
Originally Posted by jevonbrady View Post
Actually, I have a follow-up question. Many of my books are named so that their tags are included in the file name, always enclosed in curly braces. For example, consider the book I mentioned earlier.

Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009].pdf

If there were tags for this book, the name of the book would be as follows:

Rethink ~ Cut Costs Boost Innovation [Ric Merrifield, 2009]{Innovation, Cost Control}.pdf

How would the regular expression be changed to import the tags as well? This would save me a huge amount of work to classify articles and books. Thanks in advance.

BTW, I tried the following:

(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)] \{(?P<tags>[^_0-9-]*)}

but that just imports the entire filename as title, which does not help me at all! I also used <tag> for tag. Any pointers?
You can't directly import into the tags field. The work-around is to import them into another field and then do a search and replace to move them into tags.
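The first half of that work-around can be sketched in plain Python. This is a rough illustration only: using "publisher" as the temporary field is an assumption (any field the file name pattern accepts and that you don't otherwise need would do), and parse_name is a hypothetical helper, not part of calibre. The comma split at the end just shows what the text would look like once it has been moved into tags.

```python
import re

# Hypothetical variant of the earlier pattern: the text in curly braces is
# captured into "publisher" as a temporary holding field. The whole brace
# part is optional so file names without tags still match.
PATTERN = re.compile(
    r"(?P<title>[^_-]+) \[(?P<author>[^_0-9-]*), (?P<published>[0-9]*)]"
    r"(?:\{(?P<publisher>[^}]*)})?"
)


def parse_name(filename):
    """Return (fields, tags) for a file name, or ({}, []) if it doesn't match."""
    match = PATTERN.search(filename)
    if match is None:
        return {}, []
    fields = match.groupdict()
    holding = fields.get("publisher") or ""
    # Split the held text on commas, dropping empty pieces.
    tags = [part.strip() for part in holding.split(",") if part.strip()]
    return fields, tags


fields, tags = parse_name(
    "Rethink ~ Cut Costs Boost Innovation "
    "[Ric Merrifield, 2009]{Innovation, Cost Control}.pdf"
)
print(fields["title"], tags)
```

After importing with a pattern like this, the search and replace step mentioned above would copy the holding field into tags and clear it.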