01-02-2011, 03:51 AM | #1 |
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2009
Device: iPhone 3G
|
Calibre conversion without adding to library
Hi all,
I have a large collection of eBooks (several thousand), many of which have no metadata or rubbish metadata. I've been collecting for years and have used a range of devices to view them, so they are in varying formats and many have been converted several times. About 3-4 years ago I converted the majority to .txt files because, at the time, this seemed the easiest way around the whole "format wars" situation. Needless to say, I now regret this.
I now have an Android tablet and want to convert all the files to EPUB and export them to an SD card so I can read them on the go. I need to be able to recognise the files in my Android eBook reader, so I would ideally like to retain my original file names, or at least have calibre export with recognisable file names.
My problem is that when I import the files into calibre, the author/title information gets completely scrambled. The original files are named in a fairly consistent way: "author first name" "author surname" - "series name" "series no." (if part of a series) - "title". But I don't know enough about Python and regular expressions to tell calibre how to read this information and turn it into useful metadata. The result is that my calibre library is pretty messed up. Some books have the whole original file name in the "title" field, others are almost right for author and title, and some come out as complete gibberish. A couple even have the Apple logo or weird "Wingdings" characters in the title!
Knowing how protective of calibre people on this forum are, I would like to stress that I am NOT criticising calibre. I freely accept that the problem is caused by my own ignorance of calibre and Python. I am asking for advice and help.
The easiest way (from my perspective) would be to use calibre's batch conversion tools without importing the files into the library, with the converted files keeping my original file names. Is this possible? (I suspect not.)
The second option would be another batch conversion utility that can convert from PDF, LIT, HTML and TXT to EPUB. Any suggestions?
Finally (and most difficult for me), I could work out the regular expressions needed to import and export my files through calibre, reading the metadata from the file names. Can anyone advise on these?
I appreciate any help that might be offered.
01-02-2011, 04:19 AM | #2 | |
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
Quote:
You might want to look at generating a file of command lines, using an editor to carve the metadata out of the file names and create the appropriate options. That way you can set the author and title metadata inside the book, which helps with readers that take their information from the book rather than the file name.
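As a rough illustration of the "file of command lines" idea, here is a minimal Python sketch that does the carving with a script instead of an editor. It assumes the naming scheme described in the first post ("Author Name - Series Name 01 - Title.ext", series part optional) and uses calibre's `ebook-convert` tool with its `--authors`, `--title`, `--series` and `--series-index` options. The function names, the `SERIES_PART` pattern and the list of extensions are made up for this example, and the quoting produced by `shlex.join` is for a POSIX shell, so a Windows batch file would need different quoting.

```python
import re
import shlex
from pathlib import Path

# Assumed layout of the middle part of the name: "Series Name 01".
SERIES_PART = re.compile(r"^(?P<series>.+?)\s+(?P<index>[0-9.]+)$")


def build_command(source: Path, out_dir: Path) -> str:
    """Return one ebook-convert command line for a single source file."""
    # Keep the original file name, only the extension changes.
    target = out_dir / (source.stem + ".epub")
    cmd = ["ebook-convert", str(source), str(target)]

    parts = [p.strip() for p in source.stem.split(" - ")]
    if len(parts) in (2, 3):
        # First part is the author, last part is the title.
        cmd += ["--authors", parts[0], "--title", parts[-1]]
    if len(parts) == 3:
        match = SERIES_PART.match(parts[1])
        if match:
            cmd += ["--series", match["series"],
                    "--series-index", match["index"]]
    # Names that do not fit the scheme are converted without metadata options.
    return shlex.join(cmd)


def write_command_file(src_dir: Path, out_dir: Path, script: Path) -> int:
    """Write one command per convertible file in src_dir; return the count."""
    wanted = {".txt", ".pdf", ".lit", ".html"}
    files = sorted(
        p for p in src_dir.iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )
    lines = [build_command(p, out_dir) for p in files]
    script.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(files)
```

Calling `write_command_file(Path("books"), Path("epub"), Path("convert.sh"))` (hypothetical paths) would leave a script you can read through and correct by hand before running it, which matters when some of the file names do not follow the pattern.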
01-02-2011, 04:31 AM | #3 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Chaley already pointed you to the command line interface for conversions.
Quote:
Code:
^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?((?P<series>[^0-9\-]+)(\s*-\s*)?(?P<series_index>[0-9.]+)\s*-\s*)?(?P<title>[^\-_0-9]+)
I didn't create the above regex; it's been wandering around for a while.
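If you want to see what that regex makes of your file names before letting it loose on the whole collection, a small Python check along these lines may help. This is only a sketch: the regex is copied unchanged from the post above, while `parse_name` and the sample file names are invented for the example. The captured author and series keep a trailing space, so the values are stripped.

```python
import re
from pathlib import Path

# The regex exactly as posted above, split over three lines for readability.
FILENAME_RE = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"((?P<series>[^0-9\-]+)(\s*-\s*)?(?P<series_index>[0-9.]+)\s*-\s*)?"
    r"(?P<title>[^\-_0-9]+)"
)


def parse_name(filename: str) -> dict:
    """Return the metadata fields the regex finds in a file name."""
    stem = Path(filename).stem  # drop the extension before matching
    match = FILENAME_RE.match(stem)
    if match is None:
        return {}
    # Skip groups that did not take part in the match and trim whitespace.
    return {key: value.strip()
            for key, value in match.groupdict().items() if value}


if __name__ == "__main__":
    for name in [
        "Terry Pratchett - Discworld 01 - The Colour of Magic.txt",
        "Jane Austen - Emma.txt",
    ]:
        print(name, "->", parse_name(name))
```

One thing to watch for: the title group is `[^\-_0-9]+`, so a title stops at the first hyphen, underscore or digit. A file named "Joseph Heller - Catch-22.txt" would come out with the title "Catch", and names like that are worth listing and fixing by hand.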