|  01-29-2015, 04:51 PM | #1 | 
| Fully Converged            Posts: 18,175 Karma: 14021202 Join Date: Oct 2002 Location: Switzerland Device: Too many to count here. | 
				
				Importing MobileRead library into calibre library
			 
			
			Hi there, We are currently contemplating the migration of the existing MobileRead library to a calibre library. Ultimately we would like to detach the MobileRead library from the forums and present it as its own OPDS-powered website, with the content powered and managed by calibre. One aspect that I am currently struggling with is how we could preserve a link to the original attachment id. In other words, once books have been (batch) imported to calibre, we need to have a marker (a custom column for example) that contains the id of the original attachment that we can refer back to if needed. I see two possibilities right now: 
 What I am curious about is, are there any better ways of handling this situation? There may be even more information from the original attachment that we would like to extract and embed in the calibre database. For example, the timestamp of the original upload date. Again, this could probably be encoded in the filename if we could somehow tell calibre that this part signifies a custom field named "upload date", or it could later be inserted manually in the calibre database using the book id<->attachment id correlation. Thanks for your help.  Alex | 
|   |   | 
|  01-29-2015, 07:39 PM | #2 | 
| null operator (he/him)            Posts: 21,996 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | 
			
			@Alex - as things stand, the only metadata items that can be extracted from file names are those you see below.

However, if you add the custom columns you want to the library and you add the books via/with an opf, then any custom column whose data is in the opf will be populated. E.g. if you created a column called origattach/Original Attachment and you had something like this in the opf files:

Code:
<meta name="calibre:user_metadata:#origattach" blah-blah, "#value#":"MR attachment id 1234",blah blah>

Which raises the question: how would you go about getting such values into an opf file? I'll leave that to others, who have the requisite scripting and regular expression skills.

BR | 
|   |   | 
|  01-29-2015, 07:55 PM | #3 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			@BR -- I finally have my excuse, I guess... @Alexander, I have for a while wanted to be able to store the original filename in a custom column (for reasons explained HERE). But... I was always too lazy. One way would be to patch calibre to add a feature that keeps the metadata from the file contents and overrides it with the filename regex (and also enables custom columns in the from_filename metadata). The other way would be to add it via a script, possibly via calibre-debug, possibly via bash. I will have to see sometime soon if I can bash something together. Last edited by eschwartz; 01-29-2015 at 07:58 PM. | 
|   |   | 
|  01-29-2015, 08:10 PM | #4 | |
| null operator (he/him)            Posts: 21,996 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
    The file name could encapsulate multiple properties (attachment id, upload timestamp, number of downloads...), which could be subsequently pulled apart into separate custom columns with bulk edit S&R. BR | |
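A minimal Python sketch of pulling such a file name apart before import, as an alternative to bulk edit S&R. The layout `<attachment id>_<upload time in Unix seconds>_<download count>_<original name>` and the name `split_name` are assumptions made up for this illustration; nothing in the thread fixes a naming scheme, and the sample values in the usage note are invented.

```python
import re
from datetime import datetime, timezone

# Hypothetical layout, chosen for this sketch only:
#   <attachment id>_<upload time, Unix seconds>_<download count>_<original name>
NAME_RE = re.compile(
    r"^(?P<attachment_id>\d+)_(?P<uploaded>\d+)_(?P<downloads>\d+)_(?P<original>.+)$"
)


def split_name(file_name):
    """Split an encoded file name into its separate properties."""
    match = NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected file name: {file_name!r}")
    return {
        "attachment_id": int(match["attachment_id"]),
        # Interpreted as UTC; adjust if the source database stores local time.
        "uploaded": datetime.fromtimestamp(int(match["uploaded"]), tz=timezone.utc),
        "downloads": int(match["downloads"]),
        "original": match["original"],
    }
```

For example, `split_name("134208_1422097200_57_What Diantha Did - Charlotte Perkins Gilman.mobi")` would give one value per intended custom column, which could then be written with calibredb.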
|   |   | 
|  01-29-2015, 08:15 PM | #5 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			I am thinking of duplicating the regex functionality, actually. I personally would match it all to the #origfile field, but others might prefer differently. Hence -- fluidity.
		 | 
|   |   | 
|  01-29-2015, 08:53 PM | #6 | ||
| null operator (he/him)            Posts: 21,996 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
 Quote: 
 BR | ||
|   |   | 
|  01-29-2015, 09:11 PM | #7 | 
| null operator (he/him)            Posts: 21,996 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | 
			
			On the other hand, once the attachment id is in the database, producing a CSV of calibre book number and attachment id is easy enough. As Alex alluded, it could be used to extract data from the attachment id file for use in calibredb set_custom commands. BR | 
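A rough Python sketch of that idea: read a CSV that pairs calibre book ids with attachment ids, look up extra data per attachment, and build one `calibredb set_custom` command per book. The CSV headers `id` and `attachment_id`, the lookup dictionary, and the function name are assumptions for illustration. The argument order `column id value` follows the calibredb usage text as I understand it; whether the column is given with or without the leading `#` should be checked with `calibredb custom_columns`.

```python
import csv
import io


def set_custom_commands(csv_text, extra_by_attachment, column):
    """Build calibredb set_custom argument lists from a book id / attachment id CSV.

    csv_text            -- CSV with header row "id,attachment_id" (hypothetical layout)
    extra_by_attachment -- dict mapping attachment id (str) to the value to store
    column              -- label of the custom column to fill
    """
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = extra_by_attachment.get(row["attachment_id"])
        if value is None:
            continue  # nothing known about this attachment, leave the book alone
        commands.append(["calibredb", "set_custom", column, row["id"], value])
    return commands
```

Each returned list can be handed to `subprocess.run(...)` as is, which avoids shell quoting problems with values that contain spaces.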
|   |   | 
|  01-29-2015, 09:30 PM | #8 | 
| Well trained by Cats            Posts: 31,238 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | 
			
			What about adding extra tags with a unique pattern to the OPF, then using S&R to move them to the extra fields? For example: MRID123456
		 | 
|   |   | 
|  01-29-2015, 09:32 PM | #9 | 
| Grand Sorcerer            Posts: 24,905 Karma: 47303824 Join Date: Jul 2011 Location: Sydney, Australia Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos | 
			
			As calibre can read an OPF file that is supplied with the book, maybe the way to go is to generate that. I'm pretty sure this doesn't need the manifest and spine, so it would be relatively simple. The OPF can include custom columns, so adding an original file name that way would work. The other thing is that it sounds like the attachment id could be used as an identifier. Storing it as an identifier and creating a MobileRead metadata source plugin would allow easy navigation back to the source. And the metadata source plugin doesn't have to have search capabilities. It just needs to translate the identifier into a display string and a URL. | 
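A minimal Python sketch of generating such an OPF with only a metadata section (no manifest, no spine). It writes the attachment id as a `dc:identifier` with an `opf:scheme` attribute, which is how OPF 2.0 files usually carry typed identifiers; the scheme name `mobileread`, the function name, and the choice to leave out custom-column entries are assumptions, and the output has not been tested against calibre. Title and author are included on purpose, since an OPF that omits them may overwrite existing values when applied with calibredb set_metadata.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Use the conventional prefixes instead of ns0/ns1 in the serialized output.
ET.register_namespace("dc", DC)
ET.register_namespace("opf", OPF)


def build_opf(title, author, attachment_id):
    """Return a metadata-only OPF 2.0 document as a string."""
    package = ET.Element(
        f"{{{OPF}}}package",
        {"version": "2.0", "unique-identifier": "mobileread_id"},
    )
    metadata = ET.SubElement(package, f"{{{OPF}}}metadata")
    ET.SubElement(metadata, f"{{{DC}}}title").text = title
    ET.SubElement(metadata, f"{{{DC}}}creator", {f"{{{OPF}}}role": "aut"}).text = author
    # "mobileread" is a made-up scheme name for this sketch.
    identifier = ET.SubElement(
        metadata,
        f"{{{DC}}}identifier",
        {"id": "mobileread_id", f"{{{OPF}}}scheme": "mobileread"},
    )
    identifier.text = str(attachment_id)
    return ET.tostring(package, encoding="unicode")
```

The string could be written next to each extracted attachment as `metadata.opf` while the attachments are pulled from the MR database.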
|   |   | 
|  01-29-2015, 10:00 PM | #10 | 
| creator of calibre            Posts: 45,592 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			If I were you, I'd write a simple script to do it with calibredb, like this, in bash like code:

Code:
for filename, attachment_id in filenames:
    book_id=$(calibredb add filename | grep 'Added book ids:' | cut -d: -f2 | cut -d' ' -f2)
    calibredb set_metadata book_id --field identifiers:mobileread:attachment_id

Use calibredb set_metadata --list-fields to get the field names. You can even create the custom columns using calibredb, if you don't want to involve the GUI at all. | 
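The same loop as a runnable Python sketch, for anyone who prefers not to parse output with grep and cut. It relies only on the two calibredb invocations shown in the post and on the `Added book ids:` line they print; the function names and the `(path, attachment_id)` input pairs are assumptions for illustration, and error handling is kept to a minimum.

```python
import re
import subprocess

ADDED_RE = re.compile(r"Added book ids:[ \t]*([\d, ]+)")


def parse_added_ids(output):
    """Extract the book ids from the output of `calibredb add`."""
    match = ADDED_RE.search(output)
    if match is None:
        return []
    return [int(number) for number in re.findall(r"\d+", match.group(1))]


def tag_command(book_id, attachment_id):
    """Build the set_metadata call that stores the attachment id as an identifier."""
    return [
        "calibredb", "set_metadata", str(book_id),
        "--field", f"identifiers:mobileread:{attachment_id}",
    ]


def import_books(pairs):
    """Add each (path, attachment_id) pair and tag the new book record."""
    for path, attachment_id in pairs:
        added = subprocess.run(
            ["calibredb", "add", path],
            capture_output=True, text=True, check=True,
        )
        ids = parse_added_ids(added.stdout)
        if len(ids) != 1:
            # Duplicates or failures produce no single new id; skip and report.
            print(f"skipped {path}: got ids {ids}")
            continue
        subprocess.run(tag_command(ids[0], attachment_id), check=True)
```

Passing the arguments as lists means file names with spaces need no escaping.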
|   |   | 
|  01-30-2015, 07:25 AM | #11 | 
| Fully Converged            Posts: 18,175 Karma: 14021202 Join Date: Oct 2002 Location: Switzerland Device: Too many to count here. | 
			
			Thanks! Following your tips, I did some tests on the following book: https://www.mobileread.com/forums/sho...d.php?t=255021

Code:
$ calibredb set_metadata --list-fields
Title               Field name

Attachment ID       #attachmentid
Uploader            #uploader
Author Sort         author_sort
...
$ calibredb add ~/mrlibrary/AlexBell/255021/134208_What\ Diantha\ Did\ -\ Charlotte\ Perkins\ Gilman.mobi
Backing up metadata
Added book ids: 6
Notifying calibre of the change
$ calibredb set_metadata 6 --field \#uploader:"AlexBell"
Title               : What Diantha Did
Title sort          : What Diantha Did
Author(s)           : Charlotte Perkins Gilman [Gilman, Charlotte Perkins]
Publisher           : Bellware
Tags                : humanism, servant question, romance
Languages           : eng
Timestamp           : 2015-01-30T09:51:46+00:00
Published           : 2015-01-24T13:00:00+00:00
Identifiers         : mobi-asin:cec96dd0-e53e-4f26-b392-166f8c160ce4
Comments            : <p class="description">'What Diantha Did' was serialised in 'The Forerunner' from November 1909 to October 1910, and published separately in 1910. The main themes are 'The servant question' and the grief caused by having to do work in which one is not interested; set against a background of future female in-laws who would be ashamed to earn their own living, and a fiance who believes that 'No man - that is a man - would marry a woman and let her run a business.'</p>
Uploader            : AlexBell
Backing up metadata
Notifying calibre of the change
$ calibredb set_metadata 6 --field \#attachmentid:"134208"
Title               : What Diantha Did
Title sort          : What Diantha Did
Author(s)           : Charlotte Perkins Gilman [Gilman, Charlotte Perkins]
Publisher           : Bellware
Tags                : humanism, servant question, romance
Languages           : eng
Timestamp           : 2015-01-30T09:51:46+00:00
Published           : 2015-01-24T13:00:00+00:00
Identifiers         : mobi-asin:cec96dd0-e53e-4f26-b392-166f8c160ce4
Comments            : <p class="description">'What Diantha Did' was serialised in 'The Forerunner' from November 1909 to October 1910, and published separately in 1910. The main themes are 'The servant question' and the grief caused by having to do work in which one is not interested; set against a background of future female in-laws who would be ashamed to earn their own living, and a fiance who believes that 'No man - that is a man - would marry a woman and let her run a business.'</p>
Uploader            : AlexBell
Attachment ID       : 134208
Backing up metadata
Notifying calibre of the change

Also changing the identifier would work as suggested:

Code:
$ calibredb set_metadata 6 --field identifiers:mobileread:134208
Title               : What Diantha Did
Title sort          : What Diantha Did
Author(s)           : Charlotte Perkins Gilman [Gilman, Charlotte Perkins]
Publisher           : Bellware
Tags                : humanism, servant question, romance
Languages           : eng
Timestamp           : 2015-01-30T11:42:14+00:00
Published           : 2015-01-24T13:00:00+00:00
Identifiers         : mobileread:134208
Comments            : <p class="description">'What Diantha Did' was serialised in 'The Forerunner' from November 1909 to October 1910, and published separately in 1910. The main themes are 'The servant question' and the grief caused by having to do work in which one is not interested; set against a background of future female in-laws who would be ashamed to earn their own living, and a fiance who believes that 'No man - that is a man - would marry a woman and let her run a business.'</p>
Uploader            : AlexBell
Attachment ID       : 134208
Backing up metadata
Notifying calibre of the change

As Kovid suggested, using a small script to parse the filenames and to embed the info via calibredb set_metadata during the import process seems like an easy solution.

@BetterRed, @davidfor, I tested importing the extra metadata via an opf file (which would be another easy solution as I could include all the extra data in the opf while extracting the attachments from the MR database), but it appears that calibre then erases part of the existing metadata.

Code:
$ calibredb add ~/mrlibrary/AlexBell/255021/134208_What\ Diantha\ Did\ -\ Charlotte\ Perkins\ Gilman.mobi
Backing up metadata
Added book ids: 8
Notifying calibre of the change
$ calibredb show_metadata 8
Title               : What Diantha Did
Title sort          : What Diantha Did
Author(s)           : Charlotte Perkins Gilman [Gilman, Charlotte Perkins]
Publisher           : Bellware
Tags                : romance, humanism, servant question
Languages           : eng
Timestamp           : 2015-01-30T12:18:08+00:00
Published           : 2015-01-24T13:00:00+00:00
Identifiers         : mobi-asin:cec96dd0-e53e-4f26-b392-166f8c160ce4
Comments            : <p class="description">'What Diantha Did' was serialised in 'The Forerunner' from November 1909 to October 1910, and published separately in 1910. The main themes are 'The servant question' and the grief caused by having to do work in which one is not interested; set against a background of future female in-laws who would be ashamed to earn their own living, and a fiance who believes that 'No man - that is a man - would marry a woman and let her run a business.'</p>
$ cat ~/mrlibrary/AlexBell/255021/metadata.opf        
<?xml version='1.0' encoding='utf-8'?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="uuid_id" version="2.0">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
        <meta name="calibre:user_metadata:#uploader" content="{"kind": "field", "#value#": "alex", "column": "value", "colnum": 1, "is_multiple": null, "is_multiple2": {}, "search_terms": ["#uploader"], "is_csp": false, "is_category": true, "table": "custom_column_1", "is_custom": true, "is_editable": true, "rec_index": 22, "link_column": "value", "label": "uploader", "#extra#": null, "datatype": "text", "name": "Uploader", "category_sort": "value", "display": {"use_decorations": 0}}"/>
    </metadata>
</package>
$ calibredb set_metadata 8 ~/mrlibrary/AlexBell/255021/metadata.opf     
Title               : Unknown
Title sort          : Unknown
Author(s)           : Unknown
Publisher           : Bellware
Tags                : romance, humanism, servant question
Languages           : eng
Timestamp           : 2015-01-30T12:18:08+00:00
Published           : 2015-01-24T13:00:00+00:00
Identifiers         : mobi-asin:cec96dd0-e53e-4f26-b392-166f8c160ce4
Comments            : <p class="description">'What Diantha Did' was serialised in 'The Forerunner' from November 1909 to October 1910, and published separately in 1910. The main themes are 'The servant question' and the grief caused by having to do work in which one is not interested; set against a background of future female in-laws who would be ashamed to earn their own living, and a fiance who believes that 'No man - that is a man - would marry a woman and let her run a business.'</p>
Uploader            : alex
Backing up metadata
Notifying calibre of the change

@davidfor, I love the idea of a metadata source plugin that could translate the attachment id to a relevant URL! Would that be very difficult to write? | 
|   |   | 
|  01-30-2015, 08:01 AM | #12 | 
| creator of calibre            Posts: 45,592 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			No, there is no performance advantage to identifiers vs custom columns. The main advantage is that in the calibre GUI identifiers can become clickable links. Note that you don't have to bother with writing a metadata plugin for MR. Simply add the full URL, like this: identifiers:url:http://whatever and it will automatically become a link in the UI. | 
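A small Python sketch for building the `--field` value when a book should keep more than one identifier. In Alex's test earlier in the thread, setting `identifiers:mobileread:134208` left only that identifier in the output, so this sketch passes the full set each time. The comma-separated form `identifiers:type:value,type:value` is the syntax given in the calibredb usage text as far as I recall; the function name is an assumption, and `http://whatever` is the placeholder from the post, not a real address.

```python
def identifiers_field(identifiers):
    """Build a `calibredb set_metadata --field` value from a dict of identifiers.

    identifiers -- dict mapping identifier type (e.g. "url") to its value
    """
    parts = []
    for kind, value in identifiers.items():
        if "," in value:
            # A comma would be read as the start of the next identifier.
            raise ValueError(f"identifier value contains a comma: {value!r}")
        parts.append(f"{kind}:{value}")
    return "identifiers:" + ",".join(parts)
```

Usage would look like `calibredb set_metadata 6 --field "$(...)"` from a shell, or more simply the returned string passed as one argument to `subprocess.run`.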
|   |   | 
|  01-30-2015, 01:51 PM | #13 | 
| frumious Bandersnatch            Posts: 7,570 Karma: 20150435 Join Date: Jan 2008 Location: Spaniard in Sweden Device: Cybook Orizon, Kobo Aura | 
			
			But we would need a single book record to have several identifiers (of the same kind), one for each format. With a custom column we can store the different ids as a list; can we do this with standard identifiers?
		 | 
|   |   | 
|  01-30-2015, 02:14 PM | #14 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			See the OverDrive Link plugin, which does the same thing for OverDrive library links. {identifiers:select(odid)} ==

Code:
UUID****-****-****-****-************@library1.lib.overdrive.com&\
UUID****-****-****-****-************@library2.lib.overdrive.com&\
UUID****-****-****-****-************@library3.lib.overdrive.com | 
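A tiny Python sketch of the same convention applied to attachment ids: several ids packed into one identifier value with `&` as the separator, and unpacked again when needed. That a `mobileread` identifier would use this separator is an assumption borrowed from the example above, and the second id in the usage note is invented.

```python
SEPARATOR = "&"  # same separator as in the odid example; an assumption for MR ids


def join_ids(attachment_ids):
    """Pack several attachment ids into a single identifier value."""
    return SEPARATOR.join(str(attachment_id) for attachment_id in attachment_ids)


def split_ids(value):
    """Unpack an identifier value back into a list of attachment ids."""
    return [int(part) for part in value.split(SEPARATOR) if part]
```

For instance, `join_ids([134208, 134209])` gives `134208&134209`, which could be stored as `identifiers:mobileread:134208&134209`, with one id per format of the book.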
|   |   | 
|  01-30-2015, 03:15 PM | #15 | |
| null operator (he/him)            Posts: 21,996 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
 That approach required me to remove the .opf file masquerading as a format file. But no matter, calibre invariably has more than one way to get the desired result - sometimes I wonder if too many ways  BR | |
|   |   | 