#1 | |
Hedge Wizard
Posts: 802
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Update of Cleaning content.opf Plugin
Would anyone be interested in recreating the Cleaning content.opf plugin?
I would do it myself but I do not have the requisite coding knowledge. A number of people have expressed an interest in using the plugin but the author is no longer active on MobileRead. DiapDealer says in the CleanOPF thread Quote:
Last edited by Thasaidon; 03-18-2020 at 10:49 PM. Reason: missing word |
|
#2 |
Sigil Developer
Posts: 8,475
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Since I have never used the plugin, I have no idea what an opf cleaner plugin is used for.
So if you can describe exactly what it should do, I would be happy to take a shot at it from scratch so we are not violating any licensing here. But please be as specific as you can about exactly what you want removed from the opf and why.
KevinH
#3 |
Guru
Posts: 809
Karma: 2416112
Join Date: Jan 2017
Location: Poland
Device: Various
|
Functional description of the CleanOPF plugin (it can easily be fixed for Python 3, but it is useless to me):
1. At the beginning, it asks ("Insert series elements?") whether we want to add the calibre series to the metadata and, after the answer "Yes", adds two lines to the content.opf file:
Code:
<meta content="" name="calibre:series"/>
<meta content="" name="calibre:series_index"/>
2. Inserts a new UUID identifier.
3. Removes the following existing entries from the metadata (leaves others unchanged):
Code:
<dc:identifier.*>
<dc:contributor.*calibre.*>
<dc:type.*>
<dc:rights.*>
<dc:date.*>
<dc:publisher.*>
<dc:genre.*>
<dc:subject.*>
4. If it detects that there are calibre series, it also restores them.
5. Replaces the MIME type for fonts:
Code:
ttf --> application/x-font-ttf
otf --> application/vnd.ms-opentype
ttc --> application/x-font-truetype-collection
woff --> application/font-woff
The calibre series can be inserted through the Metadata Editor or even through Clips. It's possible that it's about generating a new UUID, but there are at least two other plugins that can do this. Is it about deleting metadata? But deleting publisher data or rights information seems strange to me.
MIME types for fonts inserted by default by Sigil:
Code:
ttf --> font/ttf
otf --> font/otf
ttc --> font/collection
woff --> font/woff
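For anyone who wants to see what the metadata removal and the new UUID amount to, here is a rough Python sketch. It is not the plugin's code: the regular expressions are my own reading of the patterns listed above, `clean_metadata` and the `BookId` id are hypothetical names, and the new identifier is added after the removals so that it survives them. Treat it as an illustration only.

```python
import re
import uuid

# Element names taken from the list in this post. The regex approach is an
# assumption for illustration, not the plugin's actual implementation.
REMOVE = ("identifier", "type", "rights", "date", "publisher", "genre", "subject")


def clean_metadata(opf: str, book_id: str = "BookId") -> str:
    """Drop the listed dc: elements and add a fresh UUID identifier.

    book_id is a hypothetical id value; in a real OPF it has to match the
    unique-identifier attribute on the <package> element.
    """
    for name in REMOVE:
        # Self-closing form first, then the usual paired form.
        opf = re.sub(rf"[ \t]*<dc:{name}\b[^>]*/>[ \t]*\n?", "", opf)
        opf = re.sub(
            rf"[ \t]*<dc:{name}\b[^>]*>.*?</dc:{name}>[ \t]*\n?",
            "",
            opf,
            flags=re.DOTALL,
        )
    # Only contributors whose text mentions calibre are removed.
    opf = re.sub(
        r"[ \t]*<dc:contributor\b[^>]*>[^<]*calibre[^<]*</dc:contributor>[ \t]*\n?",
        "",
        opf,
    )
    new_id = f'  <dc:identifier id="{book_id}">urn:uuid:{uuid.uuid4()}</dc:identifier>\n'
    # Insert the new identifier just before the end of the metadata section.
    return opf.replace("</metadata>", new_id + "</metadata>", 1)
```

The calibre series `<meta>` elements are not touched by this sketch, so there is nothing to restore afterwards.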
#4 |
Sigil Developer
Posts: 8,475
Karma: 5703586
Join Date: Nov 2009
Device: many
|
I really see nothing truly useful here either. So exactly why do people want this plugin?
|
#5 |
Grand Sorcerer
Posts: 28,356
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Where/who are the "number of people" who expressed an interest in using this plugin?
|
#6 |
null operator (he/him)
Posts: 21,619
Karma: 29710338
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
FWIW - I never expressed an interest as such, but I did install it out of curiosity. Then I discovered it didn't seem to do anything I couldn't do with the Metadata edit tool, so I deleted it.
BR |
#7 |
Resident Curmudgeon
Posts: 79,093
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
What an odd plugin. I'd like to ask anyone that uses it, why do you use it?
|
#8 | |
Hedge Wizard
Posts: 802
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Quote:
I was interested in the plugin because I thought it might automatically fix at least some errors in the OPF. Unfortunately this does not appear to be the case, so it is no longer of any interest.
|
#9 |
Sigil Developer
Posts: 8,475
Karma: 5703586
Join Date: Nov 2009
Device: many
|
The only part that actually fixes anything is the correction of incorrect or outdated font mimetypes, but that can be done in other ways, especially as the plugin now uses old mimetypes that have since been deprecated.
Perhaps a plugin that can audit or correct the opf mimetypes (or even update them to current values) might prove useful. We can map all file extensions except for .xml to a specific mimetype, and we could try to verify that each mimetype roughly matches the file contents, but that depends in large part on magic byte strings being identifiable in each binary file type. Is that what you are looking for?
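To make the magic-byte idea concrete, here is a minimal Python sketch limited to fonts. The four-byte signatures are the well-known leading bytes of the TrueType, OpenType (CFF), TrueType collection and WOFF containers; the media types are the ones quoted earlier in this thread as Sigil's defaults. The function names are hypothetical and this is not Sigil code, just one way such an audit could look.

```python
from typing import Optional

# Leading four bytes of common font containers, mapped to the media types
# listed earlier in this thread as Sigil's defaults.
FONT_SIGNATURES = {
    b"\x00\x01\x00\x00": "font/ttf",   # TrueType outlines
    b"true": "font/ttf",               # older Apple TrueType files
    b"OTTO": "font/otf",               # OpenType with CFF outlines
    b"ttcf": "font/collection",        # TrueType/OpenType collection
    b"wOFF": "font/woff",              # WOFF 1.0
}


def sniff_font_type(data: bytes) -> Optional[str]:
    """Return the media type suggested by the first four bytes, or None."""
    return FONT_SIGNATURES.get(data[:4])


def media_type_matches(declared: str, data: bytes) -> Optional[bool]:
    """Compare a declared media type with the sniffed one.

    Returns None when the content is not a recognised font, so the caller
    can tell "mismatch" apart from "cannot tell".
    """
    sniffed = sniff_font_type(data)
    if sniffed is None:
        return None
    return sniffed == declared
```

An audit along these lines would read the first bytes of each manifest item and report those where `media_type_matches` returns `False`; text formats without a fixed signature would still need the extension mapping.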
#10 |
Banned
Posts: 20
Karma: 10
Join Date: Feb 2020
Device: tolino epos
|
Hmm, I'm using it, and I adapted it to Python 3; mostly the print functionality was affected. I don't want to have all this crap in there, especially what calibre inserts, as I'm not using calibre.
These entries do not hurt, that's true, but they are useless from my point of view, and removing them makes the remaining entries easier to edit and maintain. Copyright information is useless as well; in most cases it is present on the imprint page. I don't know of any reader that extracts more than title, author, cover and sometimes series from the OPF. So why should I keep them? Subject is in most cases highly individual and 100% useless for me. And the plugin does each of these things with a single click. I guess I changed something else so that it only asks about series if they are not present; I don't know if and when I did that.
It also adjusts the media types for fonts so that FlightCrew and epubcheck do not complain if they have the wrong settings. But that is a feature of little value to me, as in most cases I delete all fonts, especially when the book comes with DejaVu or LinLibertine, as they do not have any value.
In the end, this plugin has the problem of doing something very specific and is not really customizable. I had the idea of a GUI to select my favourite subjects, but I didn't have the time, and I noticed that only Mantano makes use of subjects, not the tolino readers.
Ah, I just had a look: I'm deleting all HTML entities from the description by regex, as neither Mantano nor tolino can handle them. Plain text is good for me.
Summing up: this is very specific and not really customizable. For me it's a one-click solution that does a rough clean-up the way I like to have my ebooks.
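The entity deletion mentioned above could look roughly like this in Python. This takes "deleting all HTML entities by regex" literally; the pattern and the name `strip_entities` are assumptions, not the code from the modified plugin.

```python
import re

# Matches named (&nbsp;), decimal (&#8212;) and hexadecimal (&#x2014;)
# character references. The pattern is an assumption for illustration.
ENTITY = re.compile(r"&(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")


def strip_entities(description: str) -> str:
    """Delete every character reference from a description string."""
    return ENTITY.sub("", description)
```

Note that this also deletes `&amp;`, `&lt;` and `&gt;`, so a description that stores escaped HTML markup would be left with bare tag names; decoding with `html.unescape` and then stripping tags may be closer to what is wanted in that case.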
#11 | |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
The only problem that I occasionally encounter is:
Code:
<metadata>
Code:
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
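Assuming the first form is the problem and the second is the desired result, a small Python sketch of the repair could be the following. The function name is hypothetical, and the regex only handles a plain `<metadata ...>` start tag.

```python
import re

# Namespace declarations copied from the second code line above.
NAMESPACES = {
    "xmlns:dc": "http://purl.org/dc/elements/1.1/",
    "xmlns:opf": "http://www.idpf.org/2007/opf",
}


def add_metadata_namespaces(opf: str) -> str:
    """Add any missing dc/opf namespace declarations to the <metadata> tag."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        for name, uri in NAMESPACES.items():
            # Leave declarations that are already present untouched.
            if name + "=" not in attrs:
                attrs += f' {name}="{uri}"'
        return f"<metadata{attrs}>"

    # Only the first (and normally only) <metadata> start tag is rewritten.
    return re.sub(r"<metadata((?:\s[^>]*)?)>", fix, opf, count=1)
```

Running it twice gives the same result as running it once, so it should be safe to apply to files that are already correct.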
|
#12 | |
Hedge Wizard
Posts: 802
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Quote:
Thankfully I only occasionally get problems with the OPF when I run ePubCheck. A couple of simple problems are when there are lines relating to an embedded font that is no longer embedded. There is also part of a line (usually the first one) that relates to Calibre, which unfortunately ePubCheck does not like. These problems are easily sorted, as I can manually delete them.
Unfortunately, if I hit other ePubCheck errors in the OPF, I am insufficiently familiar with what should be in the OPF to identify what is specifically wrong. I cannot remember what these other problems are specifically. My old computer died and I have been offline for a couple of weeks and have been taking a holiday from editing ePubs for a few weeks longer. In these cases I cannot identify the actual problem and end up doing an ePub to ePub conversion in Calibre, which usually fixes things. I would like to avoid this, which is why I was interested in the plugin.
If it would help, I will document any future problems. Alternatively, it would be acceptable if you added a magic button to Sigil which automatically fixes all errors in the ePub.
Last edited by Thasaidon; 03-21-2020 at 08:38 AM. Reason: redundant words |
|
#13 | |||
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Quote:
Quote:
It doesn't exist in Sigil; however, the Calibre Editor has a "Try to correct all fixable errors automatically" button, but it only fixes certain types of errors.
|||
#14 | |
Hedge Wizard
Posts: 802
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Quote:
Anyway, I have been able to find more details about the errors I was talking about.
The string prefix="calibre: https://calibre-ebook.com" is sometimes included in row one of the OPF, and ePubCheck throws it up as an error.
ePubCheck also throws up errors if it finds the following in the OPF:
<manifest>
<item id="id4" href="Fonts/LiberationSerif-Bold.ttf" media-type="application/octet-stream"/>
<item id="id3" href="Fonts/LiberationSerif-BoldItalic.ttf" media-type="application/octet-stream"/>
<item id="id2" href="Fonts/LiberationSerif-Italic.ttf" media-type="application/octet-stream"/>
<item id="id1" href="Fonts/LiberationSerif-Regular.ttf" media-type="application/octet-stream"/>
</manifest>
As I said earlier, these two problems can be easily solved with a simple manual deletion. The other problems I have come across are rarer and I have not been able to work out a manual fix. If you are interested, I will be starting work on my books again in a few days and can post details of these rarer errors here when I find them. With Murphy ruling the universe, though, it will probably take some time to find some.
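For the font lines shown above, an alternative to deleting them (when the fonts are still wanted in the book) might be to rewrite the media-type from the file extension. The sketch below is only an illustration in Python: the extension table uses the values quoted earlier in this thread as Sigil's defaults, the function name is hypothetical, and whether ePubCheck accepts these values for a given EPUB version is something I have not verified.

```python
import posixpath
import re

# Extension-to-media-type table, using the Sigil defaults quoted earlier
# in this thread. Other extensions are left alone.
FONT_TYPES = {
    ".ttf": "font/ttf",
    ".otf": "font/otf",
    ".ttc": "font/collection",
    ".woff": "font/woff",
}


def fix_font_media_types(opf: str) -> str:
    """Set the media-type of font <item> elements based on their href."""

    def fix_item(match: re.Match) -> str:
        item = match.group(0)
        href = re.search(r'href="([^"]+)"', item)
        if href is None:
            return item
        ext = posixpath.splitext(href.group(1))[1].lower()
        new_type = FONT_TYPES.get(ext)
        if new_type is None:
            return item  # not a font we know about
        return re.sub(r'media-type="[^"]*"', f'media-type="{new_type}"', item)

    # \b keeps <itemref> elements in the spine from being matched.
    return re.sub(r"<item\b[^>]*>", fix_item, opf)
```

Applied to the manifest above, each of the four LiberationSerif items would end up with media-type="font/ttf", and everything else in the OPF would be left as it was.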
|
#15 |
Grand Sorcerer
Posts: 28,356
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
In my experience, all of calibre's various attributes are properly formatted and/or namespaced and thus valid according to the epub specs (and epubcheck compliant). If epubcheck complains about any of them, I suspect it's because a piece that made them valid was manually removed.
|