MobileRead Forums > E-Book Software > Calibre > Conversion

Old 07-04-2014, 11:40 AM   #1
shotsky
Enthusiast
Posts: 39
Karma: 10
Join Date: Jul 2012
Device: none
Epub (or other) to HTMLZ attributes renamed

I use the htmlz format because it places all the html into a single file. That is awesome. However, a lot of the attributes get renamed in the process, and I can't find a way to preserve the original attributes.
For example, in the epub, this is a tag:
<p class="RM-recipe-method">
But after conversion, this is that same tag:
<p class="intit">
I do the conversions on the command line. I do not care about the CSS, which would be removed anyway, but I need the original attributes: I convert the HTML to tagged text, and the original attribute tells me what the text is, while the converted attribute has no meaning.
I have tried several ways to control it, but the original attributes are always changed. Is there a way to preserve these attributes (on the command line)?
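For a post-processor that only needs the original class names, one option that avoids conversion entirely is to read them straight from the source file, since an EPUB is a ZIP archive of (X)HTML documents. The following is a minimal sketch, not a calibre feature: `ClassCollector` and `classes_in_epub` are hypothetical names, and it assumes the content files are UTF-8 and end in `.xhtml`, `.html` or `.htm`.

```python
import io
import zipfile
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect (tag, class) pairs from one (X)HTML document."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.found.append((tag, value))


def classes_in_epub(epub_file):
    """Return {archive member: [(tag, class), ...]} for each HTML file found."""
    result = {}
    with zipfile.ZipFile(epub_file) as zf:
        for member in zf.namelist():
            # Assumption: content documents use one of these extensions.
            if member.lower().endswith((".xhtml", ".html", ".htm")):
                parser = ClassCollector()
                parser.feed(zf.read(member).decode("utf-8", errors="replace"))
                parser.close()
                result[member] = parser.found
    return result


# Demo: a tiny in-memory archive stands in for a real EPUB.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "OEBPS/ch1.xhtml",
        '<html><body><p class="RM-recipe-method">Stir.</p></body></html>',
    )
demo = classes_in_epub(buf)
print(demo)
```

With a real book, `classes_in_epub("book.epub")` would take a file path instead of the in-memory buffer.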
Old 07-04-2014, 11:52 AM   #2
kovidgoyal
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No, class names are not preserved by conversion.
Old 07-04-2014, 01:09 PM   #3
shotsky
So, could a change be made that preserves the original class AND adds something for Calibre to recognize the classes? For example, from my previous post:
<p class="intit_RM-recipe-method">
That would be sufficient for my code to identify the class.
Actually, a lot of classes ARE preserved already. It's just not all of them.

Last edited by shotsky; 07-04-2014 at 01:16 PM.
Old 07-04-2014, 01:11 PM   #4
kovidgoyal
I have no interest in making such a change.
Old 07-05-2014, 12:05 PM   #5
jackie_w
Grand Sorcerer
Posts: 6,205
Karma: 16228558
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
@shotsky,

I believe a calibre conversion rationalises the css classes so that there are no classes, in the output css, with exactly the same css attributes as a differently named class. This would result in the classes in the html tags also being rationalised to match. Any unused classes would also be removed.

Is it possible that your input css for "RM-recipe-method" and "intit" are so similar that one of them is superfluous?
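The rationalisation described in this post can be pictured with a small sketch. This only illustrates the idea of merging classes whose declarations are identical. It is not calibre's actual algorithm, and the function name and the sample CSS properties below are invented for the example.

```python
def merge_equivalent_classes(rules):
    """Merge classes that carry exactly the same declarations.

    rules: {class_name: {property: value}}
    Returns (kept_rules, rename_map), where rename_map sends every input
    class name to the name that survives.
    """
    kept = {}
    first_seen = {}  # frozen declarations -> first class name that had them
    rename = {}
    for name, decls in rules.items():
        key = frozenset(decls.items())  # declaration order does not matter
        if key in first_seen:
            rename[name] = first_seen[key]  # superfluous: reuse earlier name
        else:
            first_seen[key] = name
            kept[name] = decls
            rename[name] = name
    return kept, rename


# Hypothetical stylesheet: two classes happen to have the same declarations.
rules = {
    "intit": {"text-indent": "0", "margin": "0"},
    "RM-recipe-method": {"margin": "0", "text-indent": "0"},
    "INGREDIENT": {"font-weight": "bold"},
}
kept, rename = merge_equivalent_classes(rules)
print(rename)
```

Under this model, every tag that used a superfluous class would be rewritten to the retained name, which is the kind of renaming reported at the start of the thread.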
Old 07-05-2014, 02:15 PM   #6
shotsky
I wonder if you would reconsider changing class names from the original class names? As it is, some of them do not change, others do, and it is not obvious why some change and others don't. I have about 100 users using my tools, and each one of them also uses Calibre as the ebook conversion tool. Classes often describe what a given entity is, as opposed to how it looks.
If underscores and hyphens are causing the attribute name changes, it would be satisfactory to simply eliminate those characters and use the remaining letters. Case is also unimportant for the attribute name. Numbers could be added to them if needed to keep them organized as well.
If I knew how to write that kind of code, I would tackle it myself, but I don't, so I have to rely on someone else who is willing to look at it.
Please reconsider - I am sure I'm not the only one who post-processes the output of Calibre, and retaining attribute names would help us all.
Regards,
John
Old 07-05-2014, 02:26 PM   #7
shotsky
Quote:
Originally Posted by jackie_w View Post
@shotsky,

I believe a calibre conversion rationalises the css classes so that there are no classes, in the output css, with exactly the same css attributes as a differently named class. This would result in the classes in the html tags also being rationalised to match. Any unused classes would also be removed.

Is it possible that your input css for "RM-recipe-method" and "intit" are so similar that one of them is superfluous?
Actually, what is happening is that the original attribute name is replaced with the intit attribute in the html itself. The request is simply to not change the attribute in such a way that the object it describes can no longer be discerned by a post-processor.
It is possible to do both, simply by separating the two classes with a space. That would look like:
class="RM-recipe-method intit"
Or the reverse, if preferable to Calibre:
class="intit RM-recipe-method"
In this way, the css is certain to remain as wanted by the Calibre converter, yet the 'meaning' of the class is retained.
Regards,
John
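To show why the order of the two names would not matter to a post-processor: an HTML class attribute is a whitespace-separated list of tokens, so a membership test works the same either way. A minimal sketch, with hypothetical helper names:

```python
def class_tokens(class_attr):
    """Split a class attribute value into its whitespace-separated names."""
    return class_attr.split()


def has_class(class_attr, name):
    """True if `name` is one of the classes, regardless of its position."""
    return name in class_tokens(class_attr)


# Both orderings proposed in the post give the same answer.
print(has_class("RM-recipe-method intit", "RM-recipe-method"))
print(has_class("intit RM-recipe-method", "RM-recipe-method"))
print(has_class("intit", "RM-recipe-method"))
```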
Old 07-05-2014, 05:37 PM   #8
jackie_w
Quote:
Originally Posted by shotsky View Post
Actually, what is happening is that the original attribute name is replaced with the intit attribute in the html itself.
Yes, I understood you the first time. All I was trying to say was that if the conversion process, which minimises and simplifies all the css, decides that only one of "RM-recipe-method" and "intit" is needed then all attributes in the html for the 'superfluous' class name would be renamed to the 'retained' class name ... but conversion is complicated and there's probably a lot more to it than that.

I think you've had the official answer to your request. I've never seen calibre produce html tags with multiple classes e.g. <p class="xxx yyy"> so I'd be surprised if it started now (thankfully from my POV, I'd hate that).

Speaking only for myself, I try to keep my input tags and classes as simple/minimal as possible and have found that calibre seems to retain my input class names during conversion these days (except, of course, when the input html has classless tags like <h1>, <p> etc). Whether this is pure dumb luck or whether I've inadvertently found a 'magic formula', I don't know. I suspect the former.
Old 07-11-2014, 08:24 AM   #9
shotsky
Quote:
Originally Posted by jackie_w View Post
Yes, I understood you the first time. All I was trying to say was that if the conversion process, which minimises and simplifies all the css, decides that only one of "RM-recipe-method" and "intit" is needed then all attributes in the html for the 'superfluous' class name would be renamed to the 'retained' class name ... but conversion is complicated and there's probably a lot more to it than that.
I don't think you did understand the first time. There IS no intit attribute in the original. It is an attribute invented by Calibre that REPLACES all instances of "RM-recipe-method". There are other attributes that it does not rename. The choice appears to be random, or based on some algorithm that is not evident, but it would seem to me that an attribute that already exists in a book would not need to be renamed, since it always has the same meaning and style throughout the book.
This is similar to an attribute named 'copyright', which Calibre would leave alone, since it is a 'recognized' part of a book. In my case, it is not a recognized part of a book, but it IS a clue to what follows - a direction step in a cookbook. However, in the same book, there is an attribute "INGREDIENT" that shows up in many places, but which is untouched. I don't think that is a recognized part of a book.
Note that this is not MY html in the first place. It is whatever is in the ebook to be converted, and the quality varies greatly. But this is not a quality issue; it is a mystery why Calibre should change attribute names that are perfectly valid in the first place.
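Without any change to Calibre, a post-processor could try to recover the renaming after the fact by matching paragraphs in the source HTML to paragraphs in the converted HTML by their text. The sketch below is only a rough illustration under strong assumptions: paragraph text survives conversion unchanged, it is distinctive enough to act as a key, and `<p>` elements are not nested. All names in it are hypothetical.

```python
from html.parser import HTMLParser


class ParagraphClasses(HTMLParser):
    """Record (class, normalised text) for each <p> element."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_p = False
        self._cls = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._cls = dict(attrs).get("class")
            self._text = []

    def handle_data(self, data):
        if self._in_p:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Collapse runs of whitespace so reflowed text still matches.
            text = " ".join("".join(self._text).split())
            self.items.append((self._cls, text))
            self._in_p = False


def paragraphs(html):
    parser = ParagraphClasses()
    parser.feed(html)
    parser.close()
    return parser.items


def class_map(original_html, converted_html):
    """Map each converted class to the original class(es) with the same text."""
    original = {text: cls for cls, text in paragraphs(original_html) if cls}
    mapping = {}
    for cls, text in paragraphs(converted_html):
        if cls and text in original:
            mapping.setdefault(cls, set()).add(original[text])
    return mapping


# Sample markup modelled on the class names mentioned in this thread.
original = '<p class="RM-recipe-method">Stir the sauce.</p><p class="INGREDIENT">2 eggs</p>'
converted = '<p class="intit">Stir the sauce.</p><p class="INGREDIENT">2 eggs</p>'
mapping = class_map(original, converted)
print(mapping)
```

A converted class that maps to more than one original name would signal that the two were merged, in which case the text match is the only way to tell them apart.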
Old 07-11-2014, 09:06 AM   #10
jackie_w
OK, I'm not in a position to argue as I don't have your source files. All I can say is that when calibre decides it needs to create a new class name in my conversions, it always has a name like "calibrenn". I can't think of any circumstances where calibre would pull the name "intit" out of thin air.
Old 07-11-2014, 09:43 AM   #11
theducks
Well trained by Cats
Posts: 29,768
Karma: 54401244
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jackie_w View Post
OK, I'm not in a position to argue as I don't have your source files. All I can say is that when calibre decides it needs to create a new class name in my conversions, it always has a name like "calibrenn". I can't think of any circumstances where calibre would pull the name "intit" out of thin air.

calibre# or calibre## (I don't think I have seen 3 digits even with the worst Word cr*p)
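If generated names do follow the "calibre" plus digits pattern described in these posts, a post-processor could flag them with a simple check. This is a sketch based only on that description, not on calibre's source, and the function name is hypothetical.

```python
import re

# Assumption taken from the posts above: generated names are the word
# "calibre" followed by one or more digits, e.g. calibre7 or calibre32.
GENERATED = re.compile(r"calibre\d+")


def looks_generated(class_name):
    """True if the whole class name matches the assumed generated pattern."""
    return GENERATED.fullmatch(class_name) is not None


for name in ("calibre7", "calibre32", "intit", "RM-recipe-method", "calibre-note"):
    print(name, looks_generated(name))
```

By this test "intit" does not look like a generated name.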
Old 08-23-2014, 04:18 PM   #12
Gunaddho
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2011
Device: none
Class renaming and Sigil

This issue is why I switched to Sigil for creating my epubs from html; I then use Calibre only to create .mobi (or .azw3) and pdf versions. I don't find it worthwhile to have to sort through (mild) machine language to figure out what .calibre32 or .calibre14 was if ever I have to update or correct errors in the epub after conversion. Maybe there's a way to keep my "master" version in htmlz, which doesn't rename classes, and then spit out an epub, but it seems I would have to go through and break up the html into chapters again. I would like to use Calibre for everything, including some of its more advanced features, but apparently there is no off switch or workaround for this behavior. If anyone knows a workaround in the software or workflow, I'm all ears. Appreciated.
Old 08-27-2014, 12:58 AM   #13
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
I don't know if this counts as a "workaround", but instead of comparing the calibre converter to the Sigil editor, why don't you try comparing the calibre editor to the Sigil editor?

In other words, the calibre editor absolutely does not rename things, as it is meant for </gasp> editing. Conversion, on the other hand, will gleefully rename things as it is NOT repeat NOT meant for editing!
Old 09-06-2014, 02:32 PM   #14
Gunaddho
sigil vs. calibre editor/converter

So, when I'm doing the initial creation/conversion of an epub from an html file and a css file, if I do it with Sigil my classes are respected. If I do it with Calibre I get "class=calibrenn". If there is a way to use the Calibre editor or converter to import the html and spit out an epub without rewriting the classes, I would be very interested to know.
Old 09-06-2014, 07:23 PM   #15
BetterRed
null operator (he/him)
Posts: 20,544
Karma: 26944418
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Gunaddho - the editor can be run standalone. In the File menu there's a Create New Epub option; once you have one of those, you can add component files (HTML, CSS, images, etc.).

See Using calibre's editor independently

BR
All times are GMT -4.