#1
Guru
Posts: 681
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
info as comments in OPF lost
I was looking at a batch of epubs and wondering how they were made. Opening in Sigil I could not find any declarations of the software used to make them in file headers or OPF file.
Then I opened one epub as a zip and looked in the OPF. This had a line in the metadata section:
<!-- Created by Jutoh 3.13.5 at 21/04/2022 17:22:31 - config "Epub" -->
Opening the epub in Sigil, this is not shown, and if the epub is saved, it's lost.
Sigil leaves its own mark like this:
<meta content="0.9.13" name="Sigil version" />
<meta property="dcterms:modified">2022-04-02T17:43:56Z</meta>
And Calibre like:
<meta name="calibre:timestamp" content="2017-05-05T17:14:23.743000+00:00" />
<dc:contributor opf:role="bkp">calibre (2.83.0) [https://calibre-ebook.com]</dc:contributor>
"bkp" being "Book producer".
Annoying that there is no standard for the creating software, when there is a myriad of esoteric creator/contributor codes: https://www.loc.gov/marc/relators/relaterm.html
Maybe that's why Jutoh never bothered to try.
So, I know that it's perfectly correct for Sigil to ignore such HTML comments, but in the interest of preserving info that someone thought was important enough to put there, maybe on opening these can be converted to valid metadata. Sigil already does a lot of fixes on non-standard files on first opening.
This seems to work:
<meta content="Created by Jutoh 3.13.5 at 21/04/2022 17:22:31 - config &quot;Epub&quot;" name="comment" />
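For anyone who wants to check a batch of epubs for such comments before opening them in Sigil, here is a minimal Python sketch. It assumes the epub has a standard META-INF/container.xml and a UTF-8 encoded OPF; the function names `find_opf_path` and `opf_comments` are made up for this example and are not part of Sigil or any other tool.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in an epub.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Matches XML comments, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_opf_path(zf):
    """Return the path of the OPF inside the zip, as listed in container.xml."""
    container = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


def opf_comments(epub_path):
    """Return the text of every XML comment found in the epub's OPF."""
    with zipfile.ZipFile(epub_path) as zf:
        # Assumption: the OPF is UTF-8, which is the usual case.
        opf_text = zf.read(find_opf_path(zf)).decode("utf-8")
    return [text.strip() for text in COMMENT_RE.findall(opf_text)]
```

Calling `opf_comments("book.epub")` on an untouched file would list things like the Jutoh line above without modifying the epub.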
#2
Sigil Developer
Posts: 8,783
Karma: 6000000
Join Date: Nov 2009
Device: many
The OPF is and must be machine parseable and regenerable at all times. That is how Sigil's tools work to change the OPF each and every time a file is renamed, split, merged, things added, things moved, things deleted, etc. Inside Sigil, the OPF is actually parsed and stored in data structures to make this possible and reasonably fast. All XML comments are removed from the OPF after the first Sigil command that touches the OPF, as they are free form and have no place in the data structure. Sigil has always worked like this. End users never see the OPF directly, nor are they meant to.
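As a generic illustration of that parse-then-regenerate cycle (this is not Sigil's actual code, just Python's standard ElementTree parser, and the sample OPF with its "SomeTool" comment is made up), a comment has no slot in the parsed tree and so does not come back when the file is written out again:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal OPF with a tool-generated comment in the metadata.
OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <!-- Created by SomeTool 1.0 -->
    <dc:title>Example</dc:title>
  </metadata>
</package>"""

# The default parser builds a tree of elements only; comments are discarded.
root = ET.fromstring(OPF)

# Regenerating the OPF from the tree therefore produces no comment.
regenerated = ET.tostring(root, encoding="unicode")
```

The title element survives the round trip because it is part of the data structure; the comment does not.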
You can set an environment variable to turn off Sigil adding its own metadata. And the date modified must always be added according to the epub3 spec each time the epub is modified.
And converting comments to metadata is no solution and is technically incorrect. The OPF, as with the nav and ncx, is truly meant to be machine parseable. And comments meant for humans do not belong in files only meant to be machine parseable. Jutoh users or whoever adds that comment can add that metadata if so desired and Sigil will keep it, but Sigil will not add it.
And fwiw, that Jutoh "comment" is no longer correct as soon as Sigil opens the OPF, cleans and parses it to set up the file. Sigil is the generator of that file from that moment onward.
Last edited by KevinH; 11-03-2022 at 08:48 AM.
#3
Sigil Developer
Posts: 8,783
Karma: 6000000
Join Date: Nov 2009
Device: many
Jon, please at least read what I wrote before saying something incorrect about Sigil.
As I explained above, comments in the OPF are valid and allowed on input to Sigil, but as soon as Sigil is used to do something that forces it to modify the OPF, or when the epub is saved, then (since the OPF is meant to be machine read and not read by the end user) comments are removed. And it allows Sigil to perform common OPF housekeeping tasks faster. This is how Sigil has always worked.
Sigil is not treating OPF comments like errors; it is ignoring them as the spec allows and for valid performance reasons.
And if you want to respond to *unasked* questions about how better to use Calibre, then please do so in the Calibre forum and not here in the Sigil forum.
Last edited by KevinH; 11-03-2022 at 09:21 AM.
#4
Guru
Posts: 681
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
Quote:
Consider converting comments in OPFs to something valid that retains the information on opening. "Cleaning" the OPF doesn't have to mean deleting every imperfect element.
Quote:
I manually remove junk or fix stuff in them all the time. The "<meta content" I used passes epubcheck, but I don't care what tag it has; the important thing to me is that the information is not lost before I even know it was there.
Quote:
Jutoh unfortunately chose a method that isn't kosher. I have occasionally noticed other programs putting info in HTML comments in the OPF. Only rarely do I peek at the OPF outside Sigil, so I can't say just how common this is, but it is not just Jutoh.
Last edited by AlanHK; 11-04-2022 at 03:46 AM.
#5
Resident Curmudgeon
Posts: 79,792
Karma: 146391129
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
According to epubcheck, comments in the OPF are valid. So why blow them away?
#6
Sigil Developer
Posts: 8,783
Karma: 6000000
Join Date: Nov 2009
Device: many
Yes, and as one of the authors of KindleUnpack, those comments can happily go away. I added them originally as a debug tool while KindleUnpack was still evolving to support KF8. I will remove them in a future release of KindleUnpack.
And those comments are never seen by any end user or used by any reader, and will continue to be removed upon regeneration of the OPF by Sigil for performance reasons, which is allowed by the spec. Do try and read what I wrote earlier as to why this is done. I already explained the reasons. Sigil controls and rewrites the OPF, period.
So no, I am not implementing a comment-to-metadata converter for Sigil. Do it as a standalone Python tool if you want.
I am getting really tired of getting badgered about requests for things in Sigil I have already turned down and explained why. Please stop.
Last edited by KevinH; 11-04-2022 at 10:59 AM.
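A standalone tool along those lines could start from something like the sketch below. It is only an illustration under several assumptions: the OPF text has already been read from the epub, the `<metadata>` element is unprefixed, and the `name="comment"` meta is simply the form reported earlier in the thread as passing epubcheck, not an established convention. The function names are invented for this example.

```python
import re
from xml.sax.saxutils import quoteattr

# Assumption: the metadata element is written without a namespace prefix.
METADATA_RE = re.compile(r"(<metadata\b[^>]*>)(.*?)(</metadata>)", re.DOTALL)
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def comment_to_meta(match):
    """Turn one XML comment into a <meta name="comment" .../> element."""
    text = " ".join(match.group(1).split())  # collapse runs of whitespace
    # quoteattr escapes &, < and > and picks quotes that are safe for the value.
    return '<meta name="comment" content=%s />' % quoteattr(text)


def convert_metadata_comments(opf_text):
    """Convert comments inside <metadata> only; leave the rest of the OPF alone."""
    def fix_block(m):
        return m.group(1) + COMMENT_RE.sub(comment_to_meta, m.group(2)) + m.group(3)
    return METADATA_RE.sub(fix_block, opf_text, count=1)
```

Working on the text rather than a parsed tree keeps everything else in the OPF byte-for-byte unchanged. Comments outside `<metadata>` are deliberately left untouched, since a `<meta>` element would not be valid in the manifest or spine.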
#7
Grand Sorcerer
Posts: 28,585
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD