|  07-04-2014, 11:40 AM | #1 | 
| Enthusiast  Posts: 39 Karma: 10 Join Date: Jul 2012 Device: none | 
				
				Epub (or other) to HTMLZ attributes renamed
			 
			
			I use the htmlz format because it places all the html into a single file. That is awesome. However, a lot of the attributes get renamed in the process, and I can't find a way to preserve the original attributes. For example, in the epub, this is a tag: <p class="RM-recipe-method"> But after conversion, this is that same tag: <p class="intit"> I do the conversions via command line. I do not care about css, it would be removed anyway, but I need the original attributes, because I convert the html to tagged text, and the original attribute tells me what the text is, while the converted attribute has no meaning. I have tried several ways to control it, but the original attributes are always changed. Is there a way to preserve these attributes (on the command line)? | 
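To make the dependency on class names concrete, here is a minimal sketch of the kind of post-processing described above: walking the converted HTML and pairing each paragraph's class with its text. It is not the poster's actual tool; `TaggedTextExtractor` and `tagged_text` are hypothetical names, and it uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class TaggedTextExtractor(HTMLParser):
    """Collect (class attribute, text) pairs for every <p> element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pairs = []        # finished (class, text) tuples
        self._current = None   # class of the <p> being read, or None
        self._chunks = []      # text pieces of the current <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A <p> without a class attribute is recorded with an empty class.
            self._current = dict(attrs).get("class") or ""
            self._chunks = []

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.pairs.append((self._current, "".join(self._chunks).strip()))
            self._current = None


def tagged_text(html):
    """Return a list of (class, text) pairs, one per paragraph (hypothetical helper)."""
    parser = TaggedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs
```

With the original markup the pair is `("RM-recipe-method", ...)`, which says what the text is; after conversion it becomes `("intit", ...)`, which does not.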
|   |   | 
|  07-04-2014, 11:52 AM | #2 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			No, class names are not preserved by conversion.
		 | 
|   |   | 
|  07-04-2014, 01:09 PM | #3 | 
| Enthusiast  Posts: 39 Karma: 10 Join Date: Jul 2012 Device: none | 
			
			So, could a change be made that preserves the original class AND adds something for Calibre to recognize the classes? For example, from my previous post: <p class="intit_RM-recipe-method"> That would be sufficient for my code to identify the class. Actually, a lot of classes ARE preserved already. It's just not all of them. Last edited by shotsky; 07-04-2014 at 01:16 PM. | 
|   |   | 
|  07-04-2014, 01:11 PM | #4 | 
| creator of calibre            Posts: 45,604 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			I have no interest in making such a change.
		 | 
|   |   | 
|  07-05-2014, 12:05 PM | #5 | 
| Grand Sorcerer            Posts: 6,268 Karma: 16544702 Join Date: Sep 2009 Location: UK Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3 | 
			
			@shotsky, I believe a calibre conversion rationalises the css classes so that there are no classes, in the output css, with exactly the same css attributes as a differently named class. This would result in the classes in the html tags also being rationalised to match. Any unused classes would also be removed. Is it possible that your input css for "RM-recipe-method" and "intit" are so similar that one of them is superfluous? | 
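The rationalisation suggested above can be illustrated with a small sketch. This is only a model of the behaviour guessed at in this post, not calibre's actual code: it assumes each class is reduced to a set of property/value pairs, and that the first class seen with a given set is the one that survives. `rationalise` is a hypothetical name.

```python
def rationalise(classes):
    """Map every class name to one surviving name per distinct rule set.

    `classes` maps a class name to its CSS declarations (property -> value).
    Assumption for this sketch: the first name seen for a given set of
    declarations is kept, and later duplicates are renamed to it.
    """
    survivors = {}   # frozen declarations -> surviving class name
    rename = {}      # original class name -> surviving class name
    for name, declarations in classes.items():
        # Order of declarations does not matter, so compare them as a set.
        key = frozenset(declarations.items())
        rename[name] = survivors.setdefault(key, name)
    return rename
```

Under this model, if "intit" and "RM-recipe-method" carried identical declarations and "intit" came first, every `class="RM-recipe-method"` would be rewritten to `class="intit"`, while a class with unique styling would keep its name.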
|   |   | 
|  07-05-2014, 02:15 PM | #6 | 
| Enthusiast  Posts: 39 Karma: 10 Join Date: Jul 2012 Device: none | 
			
			I wonder if you would reconsider changing class names from the original class names? As it is, some of them do not change, others do, and it is not obvious why some change and others don't. I have about 100 users using my tools, and each one of them also uses Calibre as the ebook conversion tool. Classes often describe what a given entity is, as opposed to how it looks.  If underscores and hyphens are causing the attribute name changes, it would be satisfactory to simply eliminate those characters and use the remaining letters. Case is also unimportant for the attribute name. Numbers could be added to them if needed to keep them organized as well. If I knew how to write that kind of code, I would tackle it myself, but I don't so I have to rely on someone else that is willing to look at it. Please reconsider - I am sure I'm not the only one that post processes the output of Calibre, and retaining attribute names would help us all. Regards, John | 
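The naming scheme proposed here (drop underscores and hyphens, ignore case, add numbers where needed) could look roughly like the sketch below. It only illustrates the request; nothing like it is known to exist in calibre, and `normalise_names` is a hypothetical name.

```python
def normalise_names(names):
    """Return a mapping from each original class name to a simplified one.

    Hyphens and underscores are removed and the result is lower-cased.
    If two different originals collapse to the same name, a number is
    appended to the later one so the names stay distinct.
    """
    assigned = {}    # original name -> simplified name
    taken = set()    # simplified names already handed out
    for name in names:
        if name in assigned:
            continue  # same original listed twice: keep its first mapping
        base = name.replace("-", "").replace("_", "").lower()
        candidate, suffix = base, 1
        while candidate in taken:
            suffix += 1
            candidate = f"{base}{suffix}"
        taken.add(candidate)
        assigned[name] = candidate
    return assigned
```

The simplified name still carries the meaning of the original ("rmrecipemethod" is recognisable as a recipe method step), which is all the post-processing described in this thread would need.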
|   |   | 
|  07-05-2014, 02:26 PM | #7 | |
| Enthusiast  Posts: 39 Karma: 10 Join Date: Jul 2012 Device: none | Quote: 
 It is possible to do both, simply by separating the two classes with a space. That would look like: class="RM-recipe-method intit" Or the reverse, if preferable to Calibre: class="intit RM-recipe-method" In this way, the css is certain to remain as wanted by the Calibre converter, yet the 'meaning' of the class is retained. Regards, John | |
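In HTML, a class attribute is a space-separated list of names, so a post-processing script could pick out the meaningful one whichever order the converter used. A minimal sketch, assuming the script keeps its own list of class names it cares about (`semantic_class` and `known` are hypothetical names):

```python
def semantic_class(value, known):
    """Return the first class in `value` that appears in `known`, else None.

    `value` is the raw class attribute, e.g. "intit RM-recipe-method".
    `known` is the set of class names the post-processing tool understands.
    """
    for token in value.split():   # split() handles any run of whitespace
        if token in known:
            return token
    return None
```

Both `class="RM-recipe-method intit"` and `class="intit RM-recipe-method"` resolve to "RM-recipe-method" this way, while a bare `class="intit"` yields nothing usable.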
|   |   | 
|  07-05-2014, 05:37 PM | #8 | |
| Grand Sorcerer            Posts: 6,268 Karma: 16544702 Join Date: Sep 2009 Location: UK Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3 | Quote: 
 I think you've had the official answer to your request. I've never seen calibre produce html tags with multiple classes e.g. <p class="xxx yyy"> so I'd be surprised if it started now (thankfully from my POV, I'd hate that). Speaking only for myself, I try to keep my input tags and classes as simple/minimal as possible and have found that calibre seems to retain my input class names during conversion these days (except, of course, when the input html has classless tags like <h1>, <p> etc). Whether this is pure dumb luck or whether I've inadvertently found a 'magic formula', I don't know. I suspect the former   | |
|   |   | 
|  07-11-2014, 08:24 AM | #9 | |
| Enthusiast  Posts: 39 Karma: 10 Join Date: Jul 2012 Device: none | Quote: 
 This is similar to an attribute named 'copyright', which Calibre would leave alone, since it is a 'recognized' part of a book. In my case, it is not a recognized part of a book, but it IS a clue to what follows - a direction step in a cookbook. However, in the same book, there is an attribute "INGREDIENT" that shows up in many places, but which is untouched. I don't think that is a recognized part of a book. Note that this is not MY html in the first place - it is whatever is in the ebook to be converted, and the quality varies greatly, but this is not a quality issue, it is a mystery why Calibre should change attribute names that are perfectly valid in the first place. | |
|   |   | 
|  07-11-2014, 09:06 AM | #10 | 
| Grand Sorcerer            Posts: 6,268 Karma: 16544702 Join Date: Sep 2009 Location: UK Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3 | 
			
			OK, I'm not in a position to argue as I don't have your source files. All I can say is that when calibre decides it needs to create a new class name in my conversions, it always uses a name like "calibrenn". I can't think of any circumstances where calibre would pull the name "intit" out of thin air.
		 | 
|   |   | 
|  07-11-2014, 09:43 AM | #11 | |
| Well trained by Cats            Posts: 31,249 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | Quote: 
  calibre# or calibre## (I don't think I have seen 3 digits, even with the worst Word cr*p) | |
|   |   | 
|  08-23-2014, 04:18 PM | #12 | 
| Junior Member  Posts: 3 Karma: 10 Join Date: Dec 2011 Device: none | 
				
				Class renaming and Sigil
			 
			
			This issue is why I switched to Sigil for creating my epubs from html; I then use Calibre only to create .mobi (or .azw3) and pdf versions. I don't find it worthwhile to sort through (mild) machine language to figure out what .calibre32 or .calibre14 was if I ever have to update or correct errors in the epub after conversion. Maybe there's a way to keep my "master" version in htmlz, which doesn't rename classes, and then spit out an epub, but it seems I would have to go through and break up the html into chapters again. I would like to use Calibre for everything, including some of its more advanced features, but apparently there is no off switch or workaround for this behavior. If anyone knows a workaround in the software or workflow, I'm all ears. Appreciated.
		 | 
|   |   | 
|  08-27-2014, 12:58 AM | #13 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			I don't know if this counts as a "workaround", but instead of comparing the calibre converter to the Sigil editor, why don't you try comparing the calibre editor to the Sigil editor? In other words, the calibre editor absolutely does not rename things, as it is meant for </gasp> editing. Conversion, on the other hand, will gleefully rename things as it is NOT repeat NOT meant for editing! | 
|   |   | 
|  09-06-2014, 02:32 PM | #14 | 
| Junior Member  Posts: 3 Karma: 10 Join Date: Dec 2011 Device: none | 
				
				sigil vs. calibre editor/converter
			 
			
			So, when I'm doing the initial creation/conversion of an epub from an html file and a css file, if I do it with Sigil my classes are respected. If I do it with Calibre I get "class=calibrenn". If there is a way to use the Calibre editor or converter to import the html and spit out an epub without rewriting the classes, I would be very interested to know.
		 | 
|   |   | 
|  09-06-2014, 07:23 PM | #15 | 
| null operator (he/him)            Posts: 22,010 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | 
			
			@Gunaddho - the editor can be run standalone. In the File menu there's a Create New Epub option; once you have one of those you can add component files - HTML, CSS, Images etc. See "Using calibre's editor independently". BR | 
|   |   | 