#1
Wizard
Posts: 1,498
Karma: 5199835
Join Date: May 2010
Location: Norway
Device: Sony PRS-505, PRS-950
Regex assistance request - importing books.
Dear all,
I'm having some trouble with complex book titles, such as this fictional example:

Some Author - Ancient Egypt BC 3000-732 - The Sixth Dynasty BC 2345-2181

In Calibre the first three columns are Title, Author, Series, and the title above ends up like this:

Ancient Egypt BC | Some Author | -nothing-

when I want it to end up thus:

The Sixth Dynasty BC 2345-2181 | Some Author | Ancient Egypt BC 3000-732

Sadly my brain is biologically incompatible with everything to do with scripts, programming and such, so I haven't a clue what's going on. The regex script I use was copied from an old post here, and quite frankly I have no idea how it does what it does. I just use it because it works fine with most books and better than the other scripts I tried. The script is:

(?P<author>[a-zA-Z&' \.]+?) - \[?((?P<series>[a-zA-Z' ]+) (?P<series_index>[0-9\.]+)\]? - )?(?P<title>[a-zA-Z,'\.\- ]+).*

It's hardly a big deal, but it would be very cool if it could be made to work, and if some kind soul could give me some guidance I'd be very grateful.
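For reference, the behaviour described above can be reproduced outside Calibre. This is only a sketch: it assumes Python's `re` module matches the same way as Calibre's filename import, and the whitespace stripping is an assumption about how the result ends up displayed. The names `OLD_PATTERN` and `old_fields` are made up for the illustration.

```python
import re

# The pattern quoted in the post, unchanged (split over lines for readability).
OLD_PATTERN = re.compile(
    r"(?P<author>[a-zA-Z&' \.]+?) - "
    r"\[?((?P<series>[a-zA-Z' ]+) (?P<series_index>[0-9\.]+)\]? - )?"
    r"(?P<title>[a-zA-Z,'\.\- ]+).*"
)

filename = "Some Author - Ancient Egypt BC 3000-732 - The Sixth Dynasty BC 2345-2181"

match = OLD_PATTERN.match(filename)

# Strip surrounding whitespace (assumed to mirror what ends up in the columns).
old_fields = {
    name: (value.strip() if value is not None else None)
    for name, value in match.groupdict().items()
}

# The optional series part needs "<letters> <number> - ", but "3000-732"
# is followed by "-732", not " - ", so the whole series part is skipped.
# The title class has no digits, so the title stops just before "3000".
print(old_fields)
```

The character classes are the culprit: neither the series nor the title group accepts digits, so a series name such as "Ancient Egypt BC 3000-732" cannot be captured by this pattern.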
#2
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
It would help if you could give a larger sample set to work with; I am sure the example you have given is oversimplified.

However, assuming that your set will always have the formatting given in your post:

{author} - {series name} - {book title}

or as an example:

Some Author - Some Series - This Title Hrrr

the regex would be as simple as:

(?P<author>.+?) - (?P<series>.+?) - (?P<title>.+)

The pitfall would be titles that have a spaced dash in them, such as "Novel A - A Book About Things". There is also no provision for a series index. The following examples show two ways that may be more correct for series indices.

I think your real case is most likely closer to something like this:

Some Author - Some Series [4.0] - This Title Hrrr

regex:

(?P<author>.+?) - (?P<series>.+?) (?:\[(?P<series_index>[\d\.]{1,4})\] )?- (?P<title>.+)

Or, as the old regex expects:

Some Author - [Some Series 4.0] - This Title Hrrr

regex:

(?P<author>.+?) - (?:\[(?P<series>.+?) ?(?P<series_index>[\d\.]{1,4})?\]) - (?P<title>.+)

If none of them match your case, feel free to either PM me a list or reply here with more examples.

Last edited by Serpentine; 10-23-2011 at 10:29 PM. Reason: a little fix
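The three patterns in this post can be tried out before importing anything. The following is a rough sketch, assuming Python's `re` module behaves like Calibre's filename matching; `SIMPLE`, `BRACKETED_INDEX`, `OLD_STYLE` and `parse` are names invented here for illustration.

```python
import re

# The three patterns from this post, unchanged.
SIMPLE = re.compile(r"(?P<author>.+?) - (?P<series>.+?) - (?P<title>.+)")
BRACKETED_INDEX = re.compile(
    r"(?P<author>.+?) - (?P<series>.+?) "
    r"(?:\[(?P<series_index>[\d\.]{1,4})\] )?- (?P<title>.+)"
)
OLD_STYLE = re.compile(
    r"(?P<author>.+?) - "
    r"(?:\[(?P<series>.+?) ?(?P<series_index>[\d\.]{1,4})?\]) - "
    r"(?P<title>.+)"
)


def parse(pattern, filename):
    """Return the named groups as a dict, or None if the name does not match."""
    match = pattern.match(filename)
    return match.groupdict() if match else None


# The example from the first post: author / series / title split on " - ".
egypt = parse(
    SIMPLE,
    "Some Author - Ancient Egypt BC 3000-732 - The Sixth Dynasty BC 2345-2181",
)

# Series index in square brackets after the series name.
bracketed = parse(
    BRACKETED_INDEX, "Some Author - Some Series [4.0] - This Title Hrrr"
)

# Series name and index together inside the brackets.
old_style = parse(OLD_STYLE, "Some Author - [Some Series 4.0] - This Title Hrrr")

# Pitfall: no series, but a spaced dash in the title. The first half of
# the title is taken as the series.
pitfall = parse(SIMPLE, "Some Author - Novel A - A Book About Things")

# With a series present, the greedy title group keeps the spaced dash.
with_series = parse(
    SIMPLE, "Some Author - Some Series - Novel A - A Book About Things"
)

for result in (egypt, bracketed, old_style, pitfall, with_series):
    print(result)
```

Because the author and series groups are lazy and the title group is greedy, an extra " - " only ends up in the right place when it belongs to the title and a series is also present. Without a series, the first half of the title is captured as the series.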
#3
Wizard
Posts: 1,498
Karma: 5199835
Join Date: May 2010
Location: Norway
Device: Sony PRS-505, PRS-950
See, now I just feel stupid....
Your first example, which is so simple I could *perhaps* have constructed it myself, worked just fine. I guess I simply didn't consider going right back to basics before coming straight here, crying for help. Thank you so much for saving me from my own ineptitude!