Old 06-30-2010, 07:36 AM   #1
bigbot3
Junior Member
bigbot3 began at the beginning.
 
Posts: 4
Karma: 30
Join Date: Jun 2010
Location: Amsterdam
Device: Kobo Forma and H2O 1st gen
Custom Regular Expressions for adding book information

Hi,
I would like to start a thread collecting regular-expression recipes
for extracting book information from file names.

Here are a few to start with:

Calibre default:

(?P<title>.+) - (?P<author>[^_]+)
example "Murder on the golf links - Agatha Christie.epub"

Title: Murder on the golf links
Author: Agatha Christie
Series:
Series Index:
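
For anyone who wants to try this outside Calibre, here is a minimal Python sketch of the default expression. It assumes the extension is removed before matching and that a plain re.search is close enough to what Calibre does; the parse helper is just a name made up for this example.

```python
import os
import re

# Default expression quoted above.
PATTERN = re.compile(r"(?P<title>.+) - (?P<author>[^_]+)")

def parse(filename):
    """Return the named groups for a file name, or None if nothing matches."""
    # Assumption: the extension is not part of the text being matched.
    stem = os.path.splitext(filename)[0]
    match = PATTERN.search(stem)
    return match.groupdict() if match else None

result = parse("Murder on the golf links - Agatha Christie.epub")
print(result)
```

Because .+ is greedy, a file name with more than one " - " is split at the last one.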



\s*(?P<series_index>[0-9]*)\s*(?P<title>[^_].+) ?- (?P<author>[^_]+)

example "40 Murder on the golf links - Agatha Christie.epub"

Title: Murder on the golf links
Author: Agatha Christie
Series:
Series Index: 40.0
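
A small Python sketch of the expression above, in case it helps when experimenting. One thing it shows: in plain Python the greedy .+ keeps the space in front of the dash, so the raw title group ends in a space. The stripping and the float conversion below are my own assumptions about the clean-up that produces the fields listed above.

```python
import re

PATTERN = re.compile(
    r"\s*(?P<series_index>[0-9]*)\s*(?P<title>[^_].+) ?- (?P<author>[^_]+)"
)

# File name from the example above, without the extension.
match = PATTERN.search("40 Murder on the golf links - Agatha Christie")
raw = match.groupdict()

# Assumed clean-up step: trim surrounding whitespace from every field.
fields = {name: value.strip() for name, value in raw.items()}

# [0-9]* may match nothing, so guard before converting to a number.
series_index = float(fields["series_index"]) if fields["series_index"] else None

print(repr(raw["title"]))   # raw group, note the trailing space
print(fields, series_index)
```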


\s*((?P<title>(?P<series_index>[0-9]*)\s*[^_].+)) ?- (?P<author>([^_]+))

example "40 Murder on the golf links - Agatha Christie.epub"

Title: 40 Murder on the golf links
Author: Agatha Christie
Series:
Series Index: 40.0


\s*((?P<title>(?P<series_index>[0-9]*)\s*[^_].+)) ?- (?P<series>(?P<author>([^_]+)))

example "40 Murder on the golf links - Agatha Christie.epub"

Title: 40 Murder on the golf links
Author: Agatha Christie
Series: Agatha Christie
Series Index: 40.0
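
The last two expressions rely on nested named groups: an inner group captures part of what the outer group captures, so the same text can end up in two fields. A rough Python sketch of both, assuming plain re.search and stripped values (named_fields is a made-up helper name):

```python
import re

STEM = "40 Murder on the golf links - Agatha Christie"

# series_index is nested inside title, so the number stays in the title.
NESTED_INDEX = (
    r"\s*((?P<title>(?P<series_index>[0-9]*)\s*[^_].+)) ?- (?P<author>([^_]+))"
)
# author is additionally wrapped in series, so both get the same text.
AUTHOR_AS_SERIES = (
    r"\s*((?P<title>(?P<series_index>[0-9]*)\s*[^_].+)) ?- "
    r"(?P<series>(?P<author>([^_]+)))"
)

def named_fields(pattern, text):
    """Return the stripped named groups of the first match, or {} if none."""
    match = re.search(pattern, text)
    if match is None:
        return {}
    return {
        name: value.strip()
        for name, value in match.groupdict().items()
        if value is not None
    }

third = named_fields(NESTED_INDEX, STEM)
fourth = named_fields(AUTHOR_AS_SERIES, STEM)
print(third)
print(fourth)
```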


Maybe it would be handy to have a drop-down box with a history for the regular expression field, so earlier expressions can be picked again.
I am not too handy with regular expressions, and when experimenting I sometimes lose
the correct syntax.
Also, I love Calibre; it is a well-composed program and a natural complement to my Sony PRS-600 reader.
Old 12-25-2010, 06:28 PM   #2
arberg
Junior Member
arberg began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Dec 2010
Device: Sony PRC-650
I think it might be more helpful for people to learn that there exist tools that can help us humans create regular expressions. This is one such tool:

http://gskinner.com/RegExr/

It's pretty good. Unfortunately, it doesn't show which group matches which part of the string, and it doesn't honor certain special Python/calibre rules (such as []] and [-]). It does, however, let you enter the filename you want to match and then see, as you type, how far your current expression matches it.
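
If you have Python at hand, seeing which named group matched which part of the string takes only a few lines. This is just a sketch: show_groups is a made-up helper, and the pattern used to try it out is the Calibre default quoted in the first post.

```python
import re

def show_groups(pattern, text):
    """Return (name, matched_text, (start, end)) for every named group."""
    match = re.search(pattern, text)
    if match is None:
        return []
    # groupindex maps each group name to its number, in definition order.
    return [
        (name, match.group(name), match.span(name))
        for name in match.re.groupindex
    ]

TEXT = "Murder on the golf links - Agatha Christie"
rows = show_groups(r"(?P<title>.+) - (?P<author>[^_]+)", TEXT)
for name, value, span in rows:
    print(f"{name:>8}: {value!r} at {span}")
```

A group that did not take part in the match shows up with None as its text and (-1, -1) as its span.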

Here's my expression (to get back on topic):

^(?P<author>[^-]+)(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?.*?-\s*(?P<title>[^\]{[()]+\w)

The expression expects the filename to start with the author and end with the title (possibly followed by garbage in parentheses). It also optionally matches a series and series index in the middle. It requires the title to end in a word character (a letter, digit, or underscore), and it does not allow the title to contain parentheses, square brackets, or an opening curly brace; the title stops at the first such character and anything after it is discarded.

It matches the following example filenames:
Author Harris - Kingdom come
Author Harris - Kingdom come (v1.0)
Author Harris - Kingdom series - The Very Magical Kingdom (v1.0)
Author Harris - Kingdom series 14.5 - The Very Magical Kingdom (v1.0)
Author Harris - [Kingdom series 14.5] - The Very Magical Kingdom (v1.0)
Author Harris - [Kingdom series 14.5] [another useless string]- The Very Magical Kingdom (v1.0)

Author: Author Harris
Title: Kingdom come
Series: Kingdom series
Series Index: 14.5
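
To check the expression against all six filenames at once, something like the following Python sketch can be used. It assumes plain re.search and trims whitespace from each field, since the raw author and series groups keep the space in front of the dash; parse is a made-up helper name.

```python
import re

# The expression from this post, split over three lines for readability.
PATTERN = re.compile(
    r"^(?P<author>[^-]+)"
    r"(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?"
    r".*?-\s*(?P<title>[^\]{[()]+\w)"
)

FILENAMES = [
    "Author Harris - Kingdom come",
    "Author Harris - Kingdom come (v1.0)",
    "Author Harris - Kingdom series - The Very Magical Kingdom (v1.0)",
    "Author Harris - Kingdom series 14.5 - The Very Magical Kingdom (v1.0)",
    "Author Harris - [Kingdom series 14.5] - The Very Magical Kingdom (v1.0)",
    "Author Harris - [Kingdom series 14.5] [another useless string]"
    "- The Very Magical Kingdom (v1.0)",
]

def parse(name):
    """Return the stripped named groups, keeping None for unmatched groups."""
    match = PATTERN.search(name)
    if match is None:
        return None
    return {
        key: (value.strip() if value is not None else None)
        for key, value in match.groupdict().items()
    }

for name in FILENAMES:
    print(name, "->", parse(name))
```

The first two filenames come back with no series or series index, and the third has a series but no index.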