12-10-2012, 07:09 PM   #1
dancal
Script: store calibre ebooks as symlinks to existing lib; advanced CLI import

Calibre is great for browsing a library, but it takes the position that "all your file(names) are belong to us". Some of us have maintained an e-book library for years and want to keep control of our own filenames and directories. My script lets you add existing e-books to Calibre WITHOUT DUPLICATING STORAGE.

Run "calibre_import_with_links.py -h" for help, and read below.

On Linux and Mac, e-books are stored as symlinks to existing files; on Windows, as hardlinks (for XP compatibility).
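To illustrate the platform split, here is a minimal sketch of how such a link could be created in Python. It is not taken from the script; the function name link_ebook and its parameters are made up for this example. Note that hard links only work when both paths are on the same volume.

```python
import os
import sys

def link_ebook(source, destination):
    """Create `destination` as a link to `source` without copying the data.

    Hypothetical helper for illustration; not the script's actual code.
    """
    if sys.platform.startswith("win"):
        # Hard link: both paths must live on the same volume.
        os.link(source, destination)
    else:
        # Symlink to the absolute path so the link survives a changed cwd.
        os.symlink(os.path.abspath(source), destination)
```

Either way, the linked entry takes up practically no extra disk space, which is the whole point of the approach.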

This script lets you import a list of e-books and generate metadata automatically from their filenames, using either simple regexps or arbitrary Python expressions.

Example: my filenames are formatted as "AUTHOR1, AUTHOR2-TITLE [ATTR1=VAL1, ATTR2=VAL2].pdf" (attributes, such as ISBN, publication date etc. appear in random order):

Floyd Marinescu-EJB design patterns [license=free, isbn=0471208310, googleid=hpFdoX2ICckC, date=2002-02-19].pdf

To import these files into Calibre with metadata, I used

calibre_import_with_links.py --links --match '([^-]+)-([^[.]+)( \[)|\.' --tag title=2 --tag authors='lambda (g): g [1].replace (", ", " & ")' --match '.*isbn=([-0-9X]+)' --tag isbn=1 --dir ~/Downloads/CalibreLibIds *.pdf

Each --match is executed and its result is stored; the --tag arguments that follow it can then assign authors, date, etc. based on that match.

In this example, the title only needed the 2nd match group of the --match regex (--tag title=2). The authors were separated by commas instead of ampersands, so I used a custom str.replace expression to fix that (--tag authors='lambda (g): ...').
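For anyone who wants to try the regexps before running an import, here is a rough standalone sketch of what the two --match patterns and the three --tag assignments do to one filename. It only mimics the idea and is not the script's code: extract_metadata is a made-up name, and building g as a list indexed by group number is my assumption about what the lambda receives. It is written for Python 3, where the "lambda (g):" spelling from the command line is a syntax error, so the replace is inlined instead.

```python
import re

def extract_metadata(filename):
    """Parse title, authors and ISBN from one filename (illustrative only)."""
    # Same patterns as the two --match arguments in the command line.
    main = re.match(r'([^-]+)-([^[.]+)( \[)|\.', filename)
    isbn = re.match(r'.*isbn=([-0-9X]+)', filename)

    meta = {}
    if main and main.group(1):
        # Assumption: g[0] is the whole match, g[1], g[2], ... are the groups.
        g = [main.group(0)] + list(main.groups())
        meta['title'] = g[2]                            # --tag title=2
        meta['authors'] = g[1].replace(", ", " & ")     # --tag authors=lambda
    if isbn:
        meta['isbn'] = isbn.group(1)                    # --tag isbn=1
    return meta
```

Called on the Floyd Marinescu filename above, this gives the title "EJB design patterns", the authors "Floyd Marinescu" and the ISBN "0471208310". A made-up name such as "Alice Smith, Bob Jones-Some Title [isbn=0000000000].pdf" gives the authors "Alice Smith & Bob Jones".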

The script stores the original filenames in a separate directory (above, ~/Downloads/CalibreLibIds). Calibre tends to copy files around even for simple operations such as downloading metadata, which destroys the links. When that happens, you might want to run "calibre_import_with_links.py --dir MYDIR --links --rebuild" to reclaim the storage.
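If you want to check whether Calibre has replaced any links with full copies before rebuilding, something like the following sketch could help on Linux and Mac. It is not how --rebuild works internally (I am not describing that here); find_unlinked_books and its extensions parameter are hypothetical names for this example, and it simply reports e-book files under a library directory that are regular files rather than symlinks.

```python
import os

def find_unlinked_books(library_dir, extensions=(".pdf", ".epub")):
    """Yield e-book paths under `library_dir` that are real files, not symlinks.

    Illustrative helper only; assumes the symlink layout used on Linux/Mac.
    """
    for root, _dirs, files in os.walk(library_dir):
        for name in files:
            path = os.path.join(root, name)
            # A regular file here means the link was replaced by a copy.
            if name.lower().endswith(extensions) and not os.path.islink(path):
                yield path
```

On Windows this check does not apply as written, because hard links are indistinguishable from regular files by os.path.islink.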

The --dir directory should be kept somewhere safe, preferably outside Calibre's own directories.
Attached Files
File Type: zip calibre_import_with_links.zip (3.4 KB, 262 views)