03-11-2017, 10:41 AM | #1 |
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Add extra namespace to html tag using EpubContainer
If I am using the standard calibre EpubContainer and I want to add an extra namespace into each <html> tag, what would be the best way to go about it? e.g.
Code:
container = get_container(path_to_epub)
for name in container.manifest_items_of_type(OEB_DOCS):
    root = container.parsed(name)
    ??? what comes next ???
|
03-11-2017, 10:54 AM | #2 |
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There's no direct way to modify namespaces in lxml. Instead you'd have to clone the tree, something like
Code:
from calibre.ebooks.oeb.parse_utils import clone_element

nsmap = root.nsmap.copy()
nsmap['my namespace prefix'] = 'my namespace'
newroot = clone_element(root, nsmap, in_context=False)
container.replace(name, newroot)
|
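For anyone wondering why a clone is needed at all: in lxml, `nsmap` is a read-only view, and namespace declarations can only be supplied when an element is created. The sketch below shows the same idea with plain lxml, outside calibre. `with_extra_namespace` is a hypothetical helper written for illustration, not calibre's `clone_element`, and the `epub` prefix and URI are just an example taken from later in this thread.

```python
import copy

from lxml import etree

XHTML_NS = "http://www.w3.org/1999/xhtml"


def with_extra_namespace(root, prefix, uri):
    """Return a copy of root whose start tag also declares prefix -> uri.

    Hypothetical helper: only the element itself is rebuilt, so anything
    at document level (doctype, comments before the root) is not carried over.
    """
    # root.nsmap hands back a plain dict; editing it does not change the element.
    nsmap = dict(root.nsmap)
    nsmap[prefix] = uri
    # Declarations must be passed in when the element is created.
    new_root = etree.Element(root.tag, attrib=dict(root.attrib), nsmap=nsmap)
    new_root.text = root.text
    new_root.tail = root.tail
    for child in root:
        # Deep-copy so the original tree is left untouched.
        new_root.append(copy.deepcopy(child))
    return new_root


doc = etree.fromstring(
    b'<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
)
new_doc = with_extra_namespace(doc, "epub", "http://www.idpf.org/2007/ops")
print(etree.tostring(new_doc).decode())
```

Inside a calibre plugin you would presumably still use `clone_element` and `container.replace()` as shown in the post above; this is only meant to make the mechanism visible.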
03-11-2017, 11:19 AM | #3 |
Grand Sorcerer
|
Thanks I'd never have figured that one out by myself.
|
03-13-2017, 02:01 PM | #4 |
Grand Sorcerer
|
With hindsight, perhaps I should have called this thread 'Tinkering with auto-shift EPUB2 to EPUB3' ...
I know calibre doesn't claim to fully support EPUB3 but I thought I'd see what I could achieve using the calibre EPUB3 functions which do exist. I'm nowhere near fully conversant with the EPUB3 spec, but early tests have been quite encouraging, insomuch as an EPUB2 with zero errors in Check Book usually results in an EPUB3 with zero errors in Check Book.
However, I did accidentally come across the following TOC anomaly when using one of my test EPUB2s so I thought I'd mention it. This particular epub has:
- zero errors in Check Book
- an NCX file with an empty <navMap> section
I auto-add the nav document using:
Code:
if not find_existing_nav_toc(container):
    toc = get_toc(container)
    commit_toc(container, toc)
Code:
errors = run_checks(container)
if errors:
    fix_errors(container, errors)
Code:
<body>
<nav xmlns:ns0="http://www.idpf.org/2007/ops" ns0:type="toc"><ol></ol></nav>
</body>
For my purposes, I think I know how to auto-add a ToC item to avoid getting the error but, before I do, I thought I'd ask whether you'd consider the possibility of adding an extra auto-fix to the standard fix_errors() method? |
03-14-2017, 01:52 AM | #5 |
creator of calibre
|
The reason fix_errors() does not auto-add an entry is that having an empty ToC in a book is a bad thing, and the user should really create a proper ToC. If I just have fix_errors() auto-create an element, lazy people will just use it instead of using the ToC Editor to create a proper ToC.
Or, in other words, I don't want to encourage people to just end-run around the requirement for a ToC. |
03-14-2017, 01:53 AM | #6 |
creator of calibre
|
Really, the correct approach is to have Check Book flag the case of an empty NCX ToC as a warning as well.
|
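To illustrate the condition such a warning would test for, here is a minimal standalone sketch using only the Python standard library. It is not calibre's Check Book code; `ncx_toc_is_empty` and the two sample NCX documents are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def ncx_toc_is_empty(ncx_source):
    """Return True if the NCX has no <navPoint> anywhere inside <navMap>."""
    root = ET.fromstring(ncx_source)
    nav_map = root.find("{%s}navMap" % NCX_NS)
    if nav_map is None:
        # A missing <navMap> is treated the same as an empty one.
        return True
    return nav_map.find(".//{%s}navPoint" % NCX_NS) is None


# Hypothetical sample: an NCX like the one described in the thread.
EMPTY_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <docTitle><text>Example</text></docTitle>
  <navMap/>
</ncx>"""

# Hypothetical sample: an NCX with a single entry.
FILLED_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="n1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="ch1.xhtml"/>
    </navPoint>
  </navMap>
</ncx>"""

print(ncx_toc_is_empty(EMPTY_NCX), ncx_toc_is_empty(FILLED_NCX))
```

A plugin could run a check like this before format-shifting and report the book, rather than silently inserting a placeholder entry.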
03-14-2017, 10:42 AM | #7 | |
Grand Sorcerer
|
Quote:
I think flagging some kind of error would be a good idea. From a personal POV I'd prefer it if the error level reported for EPUB2 was the same as that for EPUB3 when, in effect, "nothing should really change" as a result of a format-shift. |
|
03-20-2017, 10:34 PM | #8 |
Grand Sorcerer
|
Still inching forward with EPUB2 to EPUB3 experimenting ...
I'm at the stage where I'm test format-shifting a larger number of error-free (in Check Book) EPUB2s to see what happens.
My first step was to make sure that Check Book also thinks the EPUB3 is error-free. Thankfully every book I've tried passed this test.
The second step was to run the EPUB3 through EpubCheck. I know you don't rate it highly but I thought it might at least highlight some extra 'problems' (meaningless or otherwise) which I might choose to fix during the shift if it can be done easily.
One of the first 'problems' I came across is with an EPUB2 which has 2 <dc:creator> tags, one for Author, the other for Translator:
Code:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:creator opf:file-as="Adler-Olsen, Jussi" opf:role="aut">Jussi Adler-Olsen</dc:creator>
<dc:creator opf:file-as="Hartford, Lisa" opf:role="trl">Lisa Hartford</dc:creator>
Code:
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:creator opf:file-as="Hartford, Lisa" opf:role="trl">Lisa Hartford</dc:creator>
<dc:creator id="id-1">Jussi Adler-Olsen</dc:creator>
<meta property="role" refines="#id-1" scheme="marc:relators">aut</meta>
<meta property="file-as" refines="#id-1">Adler-Olsen, Jussi</meta>
BTW, I also did a follow-up test where I changed the EPUB2's opf:role="trl" to a second opf:role="aut" before shifting to EPUB3. In that case both 'authors' had their <dc:creator> handled the same way and therefore no EpubCheck errors.
ETA: In case it's important, I used apply_metadata() from calibre.ebooks.metadata.opf3 to change the metadata from v2 to v3.
Last edited by jackie_w; 03-20-2017 at 10:43 PM. Reason: ETA |
03-20-2017, 11:27 PM | #9 |
creator of calibre
|
The code only handles authors, i.e. opf:role="aut" entries -- others are left alone, as calibre does not know anything about them (it only has author metadata). I suppose one could create a translation from OPF 2 roles to EPUB 3 properties -- but since the purpose of the code is not upgrading EPUBs but making sure that setting metadata in EPUB 3 works, it's not implemented. If you want to implement it, you can do so in your plugin; it should not be too hard. See the code in metadata/opf3.py for inspiration.
|
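A rough idea of what such a translation could look like, as a standalone sketch with the standard library only. It is not taken from metadata/opf3.py; `upgrade_creators` and the generated `creator-N` ids are hypothetical, and the output simply mirrors the `<meta refines=...>` shape shown for the author in the post above. It assumes the generated ids do not clash with ids already present in the OPF.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OPF_NS = "http://www.idpf.org/2007/opf"


def upgrade_creators(metadata):
    """Move opf:role / opf:file-as attributes into refining <meta> elements.

    Works in place on an OPF <metadata> element. Ids are generated naively
    as creator-1, creator-2, ... (an assumption made for this sketch).
    """
    role_attr = "{%s}role" % OPF_NS
    file_as_attr = "{%s}file-as" % OPF_NS
    meta_tag = "{%s}meta" % OPF_NS
    creators = metadata.findall("{%s}creator" % DC_NS)
    for n, creator in enumerate(creators, 1):
        # Remove the OPF 2 attributes, remembering their values.
        role = creator.attrib.pop(role_attr, None)
        file_as = creator.attrib.pop(file_as_attr, None)
        if role is None and file_as is None:
            continue
        ident = creator.get("id")
        if ident is None:
            ident = "creator-%d" % n
            creator.set("id", ident)
        for prop, value in (("role", role), ("file-as", file_as)):
            if value is None:
                continue
            meta = ET.SubElement(
                metadata, meta_tag, {"property": prop, "refines": "#" + ident}
            )
            if prop == "role":
                # Role codes such as aut and trl are MARC relator codes.
                meta.set("scheme", "marc:relators")
            meta.text = value


# Sample built from the EPUB2 metadata quoted earlier in the thread.
OPF2 = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:creator opf:file-as="Adler-Olsen, Jussi" opf:role="aut">Jussi Adler-Olsen</dc:creator>
    <dc:creator opf:file-as="Hartford, Lisa" opf:role="trl">Lisa Hartford</dc:creator>
  </metadata>
</package>"""

package = ET.fromstring(OPF2)
metadata = package.find("{%s}metadata" % OPF_NS)
upgrade_creators(metadata)
for meta in metadata.findall("{%s}meta" % OPF_NS):
    print(meta.get("refines"), meta.get("property"), meta.text)
```

Treating every role the same way means the translator and the author end up in the same refined form, which is the state the follow-up test with two "aut" entries suggests EpubCheck accepts.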
03-21-2017, 10:17 AM | #10 |
Grand Sorcerer
|
Thanks, I'll figure something out. At least I now know I wasn't doing anything wrong.
For my own books I don't really care whether the Translator metadata survives the process as I'll never use it. But I don't like the idea of leaving my metadata in a semi-converted state. |
|