08-15-2011, 10:24 AM   #3
mightymouse2045
Quote:
Originally Posted by Manichean
This is not as trivial as you may think it is, because it requires one expression to match all variations of a title occurring in the book, one for all variations of author names, etc. I don't believe it's possible to do this robustly enough and still retain the simplicity of just having to write (title) or something; at best, you'd end up with some sort of inferior copy of regular expressions.

Bottom line: Learn about regexes; it helps in more places than just using Calibre.
Yeah, I am learning a lot about regex, but you know how it is: three months down the track you get a whole bunch of files you want to import, and you have to recall the best way to do it so that it captures all or most of the files without you having to edit them all after importing. One to two hours of trial and error later, it's a pain in the butt.

I don't mean having one regex that can capture all possible combinations. What I mean is some sort of plugin that pops up a box and lets you drag the fields you want to match into the order you want them, and drag an AND, OR, or some other operator in between each field. You would then click a create button that populates a text field with the correct regex for that combination of fields and operators, which you can copy and paste into the 'Adding Books' preferences, for example.
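To make the idea a bit more concrete, here is a rough Python sketch of what the "create" button might do behind the scenes. Everything in it is an assumption for illustration: the function name `build_regex`, the way fields and separators are passed in as a list, and the use of a nested list to mean "this part is optional" are all made up here, not anything calibre provides. The group names (title, author, series, series_index) are meant to follow the named-group style used for filename patterns, but check them against your own setup.

```python
import re

# Hypothetical set of draggable fields; the names are assumptions.
FIELDS = {"title", "author", "series", "series_index"}


def build_regex(parts, anchored=True):
    """Build a regex from an ordered list of fields and separators.

    Each item in `parts` is one of:
      - a field name from FIELDS  -> becomes a named group
      - any other string          -> a literal separator, whitespace-tolerant
      - a nested list             -> an optional sub-sequence
    """
    pieces = []
    for part in parts:
        if isinstance(part, list):
            # Optional block, e.g. a series prefix only some files have.
            pieces.append("(?:" + build_regex(part, anchored=False) + ")?")
        elif part in FIELDS:
            body = r"\d+" if part == "series_index" else ".+?"
            pieces.append(f"(?P<{part}>{body})")
        elif part.strip() == "":
            # A separator made only of spaces: require some whitespace.
            pieces.append(r"\s+")
        else:
            # Escape the separator and allow optional spaces around it.
            pieces.append(r"\s*" + re.escape(part.strip()) + r"\s*")
    pattern = "".join(pieces)
    return f"^{pattern}$" if anchored else pattern


# "Drag" an optional series block, then title, a dash, and author.
pattern = build_regex(
    [["series", " ", "series_index", "_"], "title", "-", "author"]
)
regex = re.compile(pattern)
print(pattern)
print(regex.match("Kingmaker, Kingbreaker 02_ Awakened Mage - Karen Miller").groupdict())
```

One known weakness of this sketch: because spaces around a separator are optional, a title containing its own hyphen would be split at the first hyphen, so a real tool would need a stricter option for that.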

I downloaded another person's library today, and they either haven't exported the books or just haven't put them in a very import-friendly structure, so for example I have:

1 Divine by Mistake - P.C. Cast.epub
1 How to Train Your Dragon - Cressida Cowell.epub
Kingmaker, Kingbreaker 02_ Awakened Mage - Karen Miller.epub

So basically some are
title - author

and others are
series series index_ title - author

What I have to do now is split up the files and then write two regexes to capture the two variations, while also allowing for oddities in the names, i.e. some have spaces after the _ and some don't. That's the part I'm not so clever with :P
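For what it's worth, the two layouts above might be coverable with a single pattern by making the series part optional, which would avoid splitting the files up first. The following is only a sketch in Python, tried against the three example filenames and nothing else. The helper name `parse_filename` is made up for illustration, and the leading "1" on the first two files is simply left as part of the title, since it isn't clear what it stands for.

```python
import os
import re

# Optional "series NN_" prefix, then "title - author".
# \s* after the underscore tolerates both "02_ Title" and "02_Title".
FILENAME_RE = re.compile(
    r"""
    ^
    (?:                            # optional series prefix
        (?P<series>.+?) \s+
        (?P<series_index>\d+) _ \s*
    )?
    (?P<title>.+?)
    \s+ - \s+                      # dash with spaces on both sides
    (?P<author>.+)
    $
    """,
    re.VERBOSE,
)


def parse_filename(name):
    """Return a dict of the captured fields, or None if nothing matches."""
    stem, _ext = os.path.splitext(name)  # drop ".epub" before matching
    match = FILENAME_RE.match(stem.strip())
    return match.groupdict() if match else None


for name in [
    "1 Divine by Mistake - P.C. Cast.epub",
    "1 How to Train Your Dragon - Cressida Cowell.epub",
    "Kingmaker, Kingbreaker 02_ Awakened Mage - Karen Miller.epub",
]:
    print(parse_filename(name))
```

Requiring spaces on both sides of the dash is a deliberate choice here so that a hyphen inside a title or name is less likely to be mistaken for the separator. Files that put no spaces around the dash would need a looser pattern.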

It would be fantastic if some clever regex guru out there could write a regex wizard that does even 80% of the job, giving us not-so-brilliant bookworms something easier to play with and fine-tune for our purposes. A wizard clever enough to do the whole thing would of course be even better.

I'm not saying it's easy. I'm just putting the thought out there and hoping to capture the interest of someone who might be talented enough to do it.