07-26-2012, 04:48 AM | #1 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
Extra metadata import from ODT
Hi!
I'm working on a conversion that uses ODT as its base format. On the metadata side, it seems that only the most basic properties are imported from the ODT. You can add custom properties to an ODF file (File->Properties->Custom Properties), so it would be very easy to put things like publisher and dates there and include them in the metadata import for ODT files. Has nobody thought of this yet, or is this an unwanted feature? Put differently: should I submit a patch for it?
The custom fields could be named 'opf.OPFKEY' to keep them separate from other custom fields and to prevent unintentional import of custom fields not meant for eBooks...
Cheers, Oliver. |
07-26-2012, 05:56 AM | #2 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
For metadata import to be generally useful, the set of fields needs to be broadly recognized/used. Defining a special set of fields that only calibre recognizes is of limited utility.
|
07-26-2012, 06:41 AM | #3 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
LibreOffice has a dropdown with predefined custom value keys. But for ebook publishing this set is very limited (there is Publisher, but no ISBN for example).
And before something is 'broadly used', someone needs to provide the ability to use it. Just because nobody has thought of it yet does not mean it will be of limited utility. But I understand your point of view, and I'm fine with writing a wrapper script instead. Thanks for the quick answer! |
07-26-2012, 07:49 AM | #4 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I actually won't refuse a patch that adds the ability to read custom fields. In the future, if there is a more widely accepted set of fields, then calibre can always switch to using those in preference to the custom ones.
|
07-26-2012, 01:12 PM | #5 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
OK... I think the 'intentional approach' might be the better idea: you have to know what you are doing and add the right properties to your ODT document so that calibre will use them.
For this I would first check for a boolean property 'opf.metadata'. If it is true, the metadata module will parse properties named opf.XXX and add them to the metadata, possibly overwriting values set earlier (the original metadata module sets authors = creator, which need not be true). The fields I currently see are:
opf.authors
opf.author_sort
opf.publisher
opf.pubdate
opf.isbn
opf.language
I have not yet found the preferred way to submit patches to the calibre project, and the forum does not allow file attachments. How should I submit the patch? (I have already checked out the repository, and the patch will be made with bzr.) |
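The opt-in scheme proposed in this post can be sketched roughly as follows. This is only an illustration and not the code that ended up in calibre: it reads the `meta:user-defined` entries of an ODF `meta.xml` with the standard library instead of odfpy, and the function names `read_opf_properties` and `apply_overrides`, the sample values, and the use of `&` to separate several authors are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by ODF for document metadata elements.
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
PREFIX = "opf."


def read_opf_properties(meta_xml):
    """Return {key: value} for opf.* custom properties.

    Returns an empty dict unless the document opts in with a custom
    property 'opf.metadata' whose value is true.
    """
    root = ET.fromstring(meta_xml)
    props = {}
    for el in root.iter("{%s}user-defined" % META_NS):
        name = el.get("{%s}name" % META_NS, "")
        if name.startswith(PREFIX):
            props[name[len(PREFIX):]] = (el.text or "").strip()
    # The switch itself is not a metadata field, so remove it here.
    if props.pop("metadata", "false").lower() != "true":
        return {}
    return props


def apply_overrides(base, props):
    """Overlay opf.* values on already-extracted metadata (hypothetical helper)."""
    merged = dict(base)
    for key, value in props.items():
        if not value:
            continue
        if key == "authors":
            # Assumption: several authors are separated by '&'.
            merged[key] = [a.strip() for a in value.split("&") if a.strip()]
        else:
            merged[key] = value
    return merged


# Made-up meta.xml content for demonstration purposes.
SAMPLE_META = """
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0">
  <office:meta>
    <meta:user-defined meta:name="opf.metadata" meta:value-type="boolean">true</meta:user-defined>
    <meta:user-defined meta:name="opf.publisher">Example Press</meta:user-defined>
    <meta:user-defined meta:name="opf.authors">Jane Doe &amp; John Roe</meta:user-defined>
    <meta:user-defined meta:name="Reviewed by">Somebody</meta:user-defined>
  </office:meta>
</office:document-meta>
"""

if __name__ == "__main__":
    base = {"title": "Some Title", "authors": ["document creator"]}
    print(apply_overrides(base, read_opf_properties(SAMPLE_META)))
```

For a real file, `meta.xml` can be pulled out of the ODT with `zipfile.ZipFile(path).read("meta.xml")`, since an ODT is a zip archive. Custom properties without the `opf.` prefix, such as "Reviewed by" above, are ignored.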
07-26-2012, 02:35 PM | #6 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
Still experimenting some more... I think I can also enable direct cover image conversion (if the cover image is really an image and not a text page). A proof of concept with my test document already works.
By giving the image in question a name in the ODT, I can find it quickly in the document content (open it with odfpy and use the element functions to find the elements in question). My test simply sets mi.cover in odt.get_metadata to the href of the image inside the document. After a run of ebook-convert, the id of the resource in question becomes 'cover' instead of some id+count. |
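The lookup of a named image can be sketched as below. The post uses odfpy; this sketch swaps in the standard library's ElementTree so it runs without extra packages, and it is not the code from the patch. It assumes the usual ODF layout in which a `draw:frame` carries the name and wraps a `draw:image` with an `xlink:href`; the function name `find_named_image` and the frame name `cover` are hypothetical.

```python
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def find_named_image(content_xml, frame_name):
    """Return the href of the image inside the frame called frame_name, or None."""
    root = ET.fromstring(content_xml)
    for frame in root.iter("{%s}frame" % DRAW_NS):
        if frame.get("{%s}name" % DRAW_NS) != frame_name:
            continue
        image = frame.find("{%s}image" % DRAW_NS)
        if image is not None:
            return image.get("{%s}href" % XLINK_NS)
    return None


# Made-up content.xml fragment for demonstration purposes.
SAMPLE_CONTENT = """
<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <office:body>
    <office:text>
      <text:p>
        <draw:frame draw:name="logo">
          <draw:image xlink:href="Pictures/logo.png"/>
        </draw:frame>
      </text:p>
      <text:p>
        <draw:frame draw:name="cover">
          <draw:image xlink:href="Pictures/front.jpg"/>
        </draw:frame>
      </text:p>
    </office:text>
  </office:body>
</office:document-content>
"""

if __name__ == "__main__":
    print(find_named_image(SAMPLE_CONTENT, "cover"))
```

In a real document the XML would come from `zipfile.ZipFile(path).read("content.xml")`, and the returned href is the path of the picture inside the archive.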
07-26-2012, 02:38 PM | #7 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
07-26-2012, 03:34 PM | #8 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The docx metadata reader plugin takes the first image as the cover, if it is large enough and roughly the right shape (not too wide or too narrow). You should be able to do the same with odt.
|
07-26-2012, 03:46 PM | #9 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
Sure, but using a named image is even better. I don't like guessing, and I already coded it.
It also seems that the name of a picture in LibreOffice (I assume this is true for other incarnations, too) has to be unique. At least I was unable to create a file with two identically named images. I'm just looking for a place where I can add a bit of user documentation about my addition. |
07-26-2012, 03:54 PM | #10 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It's better for people who are creating an odt to be converted/imported by calibre. But for people receiving a random odt, picking the first image is better. You should treat it the same way you treat title and authors. If the special named metadata is present, use it; otherwise fall back to guessing.
|
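The "named image first, otherwise guess" rule suggested above might look like the following sketch. The size and shape thresholds are invented for illustration and are not the values calibre uses; `looks_like_cover` and `choose_cover` are hypothetical names.

```python
def looks_like_cover(width, height, min_side=300, min_ratio=1.0, max_ratio=2.0):
    """Heuristic: large enough and roughly portrait shaped.

    All threshold defaults are assumptions made for this example.
    """
    if width <= 0 or height <= 0:
        return False
    if min(width, height) < min_side:
        return False
    ratio = height / width  # true division, so 800 / 600 is about 1.33
    return min_ratio <= ratio <= max_ratio


def choose_cover(named_href, images):
    """Pick a cover href.

    named_href: href of the explicitly named cover image, or None.
    images: list of (href, width, height) tuples in document order.
    """
    # Explicit metadata wins, just as for title and authors.
    if named_href:
        return named_href
    # Otherwise guess: accept the first image only if it looks like a cover.
    if images:
        href, width, height = images[0]
        if looks_like_cover(width, height):
            return href
    return None


if __name__ == "__main__":
    print(choose_cover(None, [("Pictures/first.png", 600, 800)]))
```

Only the first image is considered in the fallback, matching the behaviour described for the docx reader earlier in the thread.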
07-26-2012, 04:30 PM | #11 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
Good idea, will do so. The height/width division in docx seems to be wrong without a from __future__ import division.
|
07-26-2012, 11:08 PM | #12 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
docx.py has
from __future__ import (unicode_literals, division, absolute_import, print_function) |
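Why that import matters for a height/width check can be shown with a small sketch. On Python 2, `/` between two integers floors the result unless `division` is imported from `__future__`; the code below emulates the floored behaviour with `//` so that it runs unchanged on Python 3. The function names are made up for this example.

```python
def ratio_floor(width, height):
    # What Python 2 computes for `height / width` on two ints
    # when the __future__ division import is missing.
    return height // width


def ratio_true(width, height):
    # What `height / width` gives with `from __future__ import division`
    # on Python 2, and always on Python 3.
    return height / width


if __name__ == "__main__":
    # A 600x800 portrait image: the floored ratio collapses to 1,
    # which makes it indistinguishable from a square image.
    print(ratio_floor(600, 800), ratio_true(600, 800))
```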
07-27-2012, 02:13 AM | #13 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
It obviously was too late yesterday.
I'm currently adding a piece of doc to conversion.rst |
07-27-2012, 09:03 AM | #14 |
Enthusiast
Posts: 32
Karma: 12
Join Date: Jul 2012
Device: Kindle 4nt 4.1.3 jailbreak
|
I have committed my changes to the branch lp:~lydon/calibre/odt-convert
|
07-27-2012, 01:03 PM | #15 |
creator of calibre
Posts: 43,852
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Merged, with a little minor refactoring. I didn't test my changes, so have a look.
|