Old 06-28-2014, 09:04 PM   #1
Sneddles
Member
Posts: 12
Karma: 10
Join Date: Feb 2012
Device: Kindle keyboard
How to Stop Calibre from Automatically Changing OPF metadata

I have noticed that Calibre changes the description metadata read from the OPF entry in an epub file. It will convert from plain text to HTML, and even add attributes to existing HTML. It will also replace a </p><p> sequence with <br>.

Is it possible to have Calibre not change the description metadata read from the file?

I have also noticed that, when editing the description, Calibre will automatically add a DIV wrapper around an entry that has multiple <p> entries. Is there a way to stop it doing this?
Old 06-28-2014, 09:30 PM   #2
theducks
Well trained by Cats
Posts: 29,768
Karma: 54401244
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Sneddles View Post
I have noticed that Calibre changes the description metadata read from the OPF entry in an epub file. It will convert from plain text to HTML, and even add attributes to existing HTML. It will also replace a </p><p> sequence with <br>.

Is it possible to have Calibre not change the description metadata read from the file?

I have also noticed that, when editing the description, Calibre will automatically add a DIV wrapper around an entry that has multiple <p> entries. Is there a way to stop it doing this?
NO

The metadata update comes from the library; if it is the same, you see no change.

I have never seen </p> <p> replaced with a <br />. Your original source must have some weird error embedded.
I have converted many a book and not seen a <div> used. Again, a weird SOURCE is the only logical explanation.
Validating your HTML before passing it to Calibre is the only recommendation I can offer.
Old 06-29-2014, 03:36 AM   #3
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
I have noticed that calibre adds a div around the comments metadata if there are certain types of HTML in it, which I assume is because of sanitizing. If calibre does it, I trust there is a good reason. (@theducks I think this is not a conversion thing, but rather still a metadata update.)

I have never seen the p to br, or any other changes besides the div.
Old 07-03-2014, 08:28 PM   #4
Sneddles
Quote:
Originally Posted by theducks View Post
NO

The Metadata update comes from the Library, if it is the same, you see no change.
I should have made it clear that it happens when importing an epub.

I viewed the description entry in the original epub's content.opf and it was plain text with blank lines between paragraphs. When Calibre imported it, the text was changed to multiple <p class="description"> elements, using the blank lines to delimit the paragraphs, and the <p> elements were wrapped in a <div> element.
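
The transformation described above (blank lines delimiting paragraphs, each paragraph becoming a <p class="description">, and the whole lot wrapped in a <div>) can be sketched roughly as follows. This only illustrates the observed result and is not calibre's actual code; the function name plain_text_to_html and its css_class parameter are hypothetical.

```python
import re
from html import escape


def plain_text_to_html(text, css_class="description"):
    """Illustrative only: mimic the observed plain-text-to-HTML result.

    Paragraphs are delimited by blank lines, each one is wrapped in
    <p class="...">, and all paragraphs are wrapped in a single <div>.
    """
    # Split on one or more blank lines and drop empty chunks.
    chunks = re.split(r"\n\s*\n", text.strip())
    paragraphs = [c.strip() for c in chunks if c.strip()]

    # Collapse internal whitespace and escape markup-significant characters.
    body = "\n".join(
        '<p class="{}">{}</p>'.format(css_class, escape(" ".join(p.split())))
        for p in paragraphs
    )
    return "<div>\n{}\n</div>".format(body)


if __name__ == "__main__":
    print(plain_text_to_html("First paragraph.\n\nSecond paragraph."))
```

Feeding it a description with one blank line in the middle yields two <p class="description"> elements inside one <div>, which matches what I saw after import.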

I edited the file to remove the <div> elements and re-imported the epub. This time Calibre combined the <p> elements with a <br> element.

I have also seen Calibre wrap multiple <p> elements with a <div> element when editing the description using the metadata editor in Calibre.

Quote:
Originally Posted by eschwartz View Post
I have noticed that calibre adds a div around the comments metadata if there is certain types of html in it, which I assume is because of sanitizing. If calibre does it I trust there is a good reason.
I can't think of any good reason to wrap using a <div> element.

The OPF spec doesn't state any specific requirements, and there is nothing in the HTML spec that would require it either.

Neither can I think of a good reason to specify a class attribute on the <p> element. The class attribute is useless without an entry in an associated CSS file, and there doesn't seem to be anything in the OPF spec to allow a CSS file to be provided for the .opf file.

Last edited by Sneddles; 07-03-2014 at 08:44 PM. Reason: Additional info.
Old 07-03-2014, 11:43 PM   #5
kovidgoyal
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
calibre does not change *anything* in any file on import. As for changes to the HTML comments stored in the *calibre* database: they are made for good reasons. If you want to understand those reasons, read the code.
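
One way to check that the file itself is untouched is to read the description straight out of the epub's OPF before and after adding the book. Below is a minimal sketch using only the Python standard library. It assumes a standard epub layout in which META-INF/container.xml points at the OPF; the function names are made up for illustration and are not part of calibre.

```python
import zipfile
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def description_from_opf(opf_xml):
    """Return the text of the first dc:description in an OPF document, or None."""
    root = ET.fromstring(opf_xml)
    node = root.find(".//{%s}description" % DC_NS)
    if node is None:
        return None
    return node.text or ""


def description_from_epub(path_or_file):
    """Locate the OPF via META-INF/container.xml and return its description."""
    with zipfile.ZipFile(path_or_file) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(".//{%s}rootfile" % CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        return description_from_opf(zf.read(rootfile.get("full-path")))
```

Running description_from_epub on the same file before and after import, and comparing the two strings, shows whether the file on disk changed or only the copy held in the library database.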
Old 07-03-2014, 11:57 PM   #6
eschwartz
Here is where it is processed: https://github.com/kovidgoyal/calibr...ry/comments.py

@Kovid

I don't see anything there about divs, and my search skills proved woefully inadequate.

Simple curiosity.

EDIT: Thanks, found it!

Last edited by eschwartz; 07-04-2014 at 12:36 AM.
Old 07-04-2014, 12:24 AM   #7
kovidgoyal
That's the sanitization code; you also need to look at comments_editor.py.
Old 07-04-2014, 10:02 PM   #8
Sneddles
I had an expectation of how Calibre would work that resulted in me making some assumptions about the behavior I was seeing. For that I apologize.

The behavior I am seeing is that the .azw3 file created from the epub file has the changed description.

I haven't looked at the code for some time, (in my view it doesn't follow Robert Martin's clean code guidelines or the SOLID principles which makes it difficult to read and understand; and I shouldn't need to do so), but I seem to remember that it uses an intermediate form when converting from one format to another.

I never looked at how the metadata is handled, but I am assuming that the data used in the output file is coming from the Calibre database, and not the source file.

So I guess that is how and why the conversion happens.
Old 07-04-2014, 10:12 PM   #9
kovidgoyal
Quote:
Originally Posted by Sneddles View Post
I haven't looked at the code for some time, (in my view it doesn't follow Robert Martin's clean code guidelines or the SOLID principles which makes it difficult to read and understand; and I shouldn't need to do so), but I seem to remember that it uses an intermediate form when converting from one format to another.
If you want to understand something, you need to read the code. No one is going to hold your hand and explain every little detail to you. And if you find the calibre code base hard to follow, you are welcome to ask for pointers as to where to begin. However, making ill-informed statements about the quality of a code base you don't understand is not a good way to begin. It says far more about your limitations than those of the code base.
Old 07-11-2014, 11:18 PM   #10
Sneddles
Actually it is possible to stop Calibre from making any changes to the metadata in the generated .azw3 file. It doesn't require any changes to Calibre code either.
Old 07-19-2014, 06:57 PM   #11
Sneddles
Interesting.

The behaviour of Calibre 1.45 has changed and the description in the .azw3 file is now the same as in the original epub.