Old 06-11-2018, 07:45 PM   #1
spiralpath
Member
Posts: 10
Karma: 116
Join Date: May 2011
Device: Multiple
Calibre changes HTML markup during convert

I am running Calibre 3.13, and when I use "Convert books" to convert to EPUB, much of my markup changes. Some code I added to stylesheet.css is removed.

And, it completely removed a file (TableOfContents.html) which seems to be lost forever.
Old 06-11-2018, 09:25 PM   #2
theducks
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by spiralpath
I am running Calibre 3.13, and when I use "Convert books" to convert to EPUB, much of my markup changes. Some code I added to stylesheet.css is removed.

And, it completely removed a file (TableOfContents.html) which seems to be lost forever.
Ah! Did you have that CSS properly linked from the TOC HTML file?
AFAIK, calibre only keeps what is linked.
And you do NOT want calibre to replace the TOC.

There are so many settings.

Remember! Preferences: ... holds the DEFAULTS that are used for the INITIAL conversion.

Once a book has been converted, it retains the settings that were used (they may be modified when the conversion starts). From then on, the Conversion screen shows the settings that were used previously. This is PER BOOK.
There is a button (a tick in bulk mode) that makes the conversion forget those settings and grab a fresh 'default'.
Old 06-11-2018, 09:37 PM   #3
kovidgoyal
creator of calibre
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The only way calibre will completely remove a file is if you incorrectly marked it as the titlepage or it is not in the spine.

As for removing CSS, calibre completely rewrites all css, flattening it and keeping only the CSS that actually applies to your markup.
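
To illustrate the idea of "keeping only the CSS that actually applies to your markup", here is a minimal Python sketch. It is not calibre's code: calibre works out which styles apply to each element, while this sketch only compares class names. The names `ClassCollector` and `unused_class_rules`, and the sample declarations, are hypothetical.

```python
import re
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def unused_class_rules(css_text, html_text):
    """Return class names that have a CSS rule but never appear in the HTML."""
    collector = ClassCollector()
    collector.feed(html_text)
    selector_classes = set()
    # Naive parsing: look at the text before each "{" and pull out ".name" tokens.
    for selector in re.findall(r"([^{}]+)\{", css_text):
        selector_classes.update(re.findall(r"\.([A-Za-z_][\w-]*)", selector))
    return sorted(selector_classes - collector.classes)


# Hypothetical sample: a bodyText rule exists, but no element uses that class.
css = ".bodyText { text-indent: 1em } .subhead { font-weight: bold }"
html = '<p class="subhead">Checking Goals at the Door</p>'
print(unused_class_rules(css, html))  # ['bodyText']
```

A rule that this kind of check reports as unused is the sort of rule you should expect to be missing from the converted book's stylesheet.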
Old 06-11-2018, 09:44 PM   #4
spiralpath
Quote:
Originally Posted by theducks
Ah! Did you have that CSS properly linked from the TOC HTML file?
AFAIK, calibre only keeps what is linked.
And you do NOT want calibre to replace the TOC.

There are so many settings.

Remember! Preferences: ... holds the DEFAULTS that are used for the INITIAL conversion.

Once a book has been converted, it retains the settings that were used (they may be modified when the conversion starts). From then on, the Conversion screen shows the settings that were used previously. This is PER BOOK.
There is a button (a tick in bulk mode) that makes the conversion forget those settings and grab a fresh 'default'.
Thank you, theducks, for your time and help.

My biggest challenge at the moment is having my HTML markup changed when I run "Convert Book" (to EPUB). I add markup like <span id="chap1-2">some text </span>, and after the conversion runs and I open "Edit Book," the <span> markup is gone.

Also, I add <p class="bodyText"> to the beginning of each paragraph, and after "Convert Book" that markup is changed to <p class="calibre7"> and the bodyText rule is removed from the CSS file.
Old 06-11-2018, 11:09 PM   #5
PeterT
Grand Sorcerer
Posts: 12,166
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
I hate to be the bearer of bad news but if you do a conversion, calibre will rewrite / merge all the CSS and assign the new classes its own names.

Old 06-12-2018, 12:42 AM   #6
spiralpath
Quote:
Originally Posted by PeterT
I hate to be the bearer of bad news but if you do a conversion, calibre will rewrite / merge all the CSS and assign the new classes its own names.

Thank you, Peter.

I am not attached to the CSS classes I name; however, I am very partial to the markup that is being removed completely. I am using the following markup to identify specific sub-headings in the text that are linked from the table of contents I created in TableOfContents.html, e.g.:

excerpt from TableOfContents.html:

<p class="toc-level1"><a href="chapter-1.html#chap1-3">Checking Goals at the Door</a></p>

excerpt from chapter-1.html:

<p class="subhead"><span id="chap1-3">Checking Goals at the Door</span></p>

Oddly, the Calibre conversion process does not change the subhead class, but it removes the span tags completely. Hence, my strategy to link my table of contents via HTML is foiled.

excerpt from chapter1.html, post-conversion:

<p class="subhead">Checking Goals at the Door</p>
Old 06-12-2018, 12:45 AM   #7
spiralpath
Quote:
Originally Posted by kovidgoyal
The only way calibre will completely remove a file is if you incorrectly marked it as the titlepage or it is not in the spine.

As for removing CSS, calibre completely rewrites all css, flattening it and keeping only the CSS that actually applies to your markup.
Thank you very much, kovidgoyal,

I am not attached to the CSS classes I name; however, I am very partial to the markup that is being removed completely. I am using the following markup to identify specific sub-headings in the text that are linked from the table of contents I created in TableOfContents.html, e.g.:

excerpt from TableOfContents.html:

<p class="toc-level1"><a href="chapter-1.html#chap1-3">Checking Goals at the Door</a></p>

excerpt from chapter-1.html:

<p class="subhead"><span id="chap1-3">Checking Goals at the Door</span></p>

Oddly, the Calibre conversion process does not change the subhead class, but it removes the span tags completely. Hence, my strategy to link my table of contents via HTML is foiled.

excerpt from chapter1.html, post-conversion:

<p class="subhead">Checking Goals at the Door</p>
Old 06-12-2018, 12:57 AM   #8
kovidgoyal
I don't see how calibre could possibly be removing span tags, unless you have set some conversion setting telling it to do so, such as heuristics or search and replace, etc.

In any case, those span tags are completely superfluous; simply put your id on the <p> tag and you can link to it just the same.
Old 06-12-2018, 01:25 AM   #9
DNSB
Bibliophagist
Posts: 35,393
Karma: 145435140
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by spiralpath
Thank you very much, kovidgoyal,

I am not attached to the CSS classes I name; however, I am very partial to the markup that is being removed completely. I am using the following markup to identify specific sub-headings in the text that are linked from the table of contents I created in TableOfContents.html, e.g.:

excerpt from TableOfContents.html:

<p class="toc-level1"><a href="chapter-1.html#chap1-3">Checking Goals at the Door</a></p>

excerpt from chapter-1.html:

<p class="subhead"><span id="chap1-3">Checking Goals at the Door</span></p>

Oddly, the Calibre conversion process does not change the subhead class, but it removes the span tags completely. Hence, my strategy to link my table of contents via HTML is foiled.

excerpt from chapter1.html, post-conversion:

<p class="subhead">Checking Goals at the Door</p>
A quick look at your code suggests one possibility: your spans do not include a class, so they are being removed as do-nothing code. My personal feeling is that this should only happen to <span> tags that carry no other information.

As suggested, using <p class="subhead" id="chap1-3"> would be cleaner. If you really want to use the spans, something like <span class="dummy" id="chap1-3"> would survive the conversion process.
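
One way to see exactly which links a conversion breaks is to check every TOC link against the ids that exist in the target file, before and after converting. The following is a minimal Python sketch of that check, assuming the files are passed in as strings; `LinkAndIdCollector`, `collect` and `broken_anchors` are hypothetical helper names, not part of calibre.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class LinkAndIdCollector(HTMLParser):
    """Collect all id attributes and all <a href="file#fragment"> links."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.links = []  # list of (file name, fragment) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        if tag == "a" and attrs.get("href"):
            path, fragment = urldefrag(attrs["href"])
            if fragment:
                self.links.append((path, fragment))


def collect(html_text):
    collector = LinkAndIdCollector()
    collector.feed(html_text)
    return collector


def broken_anchors(toc_html, chapters):
    """chapters maps file name -> HTML text. Return links whose target id is missing."""
    broken = []
    for path, fragment in collect(toc_html).links:
        target = chapters.get(path)
        if target is None or fragment not in collect(target).ids:
            broken.append(f"{path}#{fragment}")
    return broken


toc = '<p class="toc-level1"><a href="chapter-1.html#chap1-3">Checking Goals at the Door</a></p>'
before = {"chapter-1.html": '<p class="subhead" id="chap1-3">Checking Goals at the Door</p>'}
after = {"chapter-1.html": '<p class="subhead">Checking Goals at the Door</p>'}
print(broken_anchors(toc, before))  # []
print(broken_anchors(toc, after))   # ['chapter-1.html#chap1-3']
```

The check does not care whether the id sits on the <p> or on a <span>, so it works for either of the two markup styles discussed above.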

Last edited by DNSB; 06-12-2018 at 01:27 AM.
Old 06-12-2018, 01:52 AM   #10
DNSB
Interesting. I added a span with an id to a test epub and then converted it from epub to epub. Not only was the span with an id not removed, it had a class added. My theory about the spans being removed seems to have been shot down. Calibre 3.25, BTW.

Original epub:
Code:
<body class="epub">
  <p class="paranon"><span id="tarfu_1.3">This is a sample ebook with a few lines of text though it's hard to say what is actually a line of text when font size, screen width, margin size are variables beyond the control of the author. Of course, we could go to absolute measurements if we really want odd effects. EOT</span></p>
</body>
Converted epub:
Code:
  <body class="epub">
  <p class="paranon"><span id="tarfu_1.3" class="calibre">This is a sample ebook with a few lines of text though it's hard to say what is actually a line of text when font size, screen width, margin size are variables beyond the control of the author. Of course, we could go to absolute measurements if we really want odd effects. EOT</span></p>
</body>
The added class:
Code:
.calibre {
    line-height: 1.2
    }
Old 06-12-2018, 01:53 AM   #11
spiralpath
Thanks, again, kovidgoyal,

Heuristics is turned off and there are no search and replace rules.

I put the id in the <p> tag, as you suggested, and the id was retained in the first occurrence and removed in subsequent occurrences.

This time, I named my table of contents file "Contents.html" and confirmed that it was included in content.opf. However, after conversion, Contents.html was removed.
Old 06-12-2018, 01:56 AM   #12
spiralpath
Thank you, David,

Good suggestion to put the id in the <p> tag.

I put the id in the <p> tag, as you suggested, and the id was retained in the first occurrence and removed in subsequent occurrences.

This time, I named my table of contents file "Contents.html" and confirmed that it was included in content.opf. However, after conversion, Contents.html was removed.
Old 06-12-2018, 02:45 AM   #13
kovidgoyal
Including it in content.opf is not enough; you need to put it in the spine. Anyway, it's too difficult trying to guess what you are doing wrong, see https://www.mobileread.com/forums/sh...d.php?t=186697 for how to provide enough information to get useful answers.
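
For readers unsure of the difference: a file can be listed in the OPF manifest without being referenced from the spine. The following minimal Python sketch lists such files. The sample OPF is a made-up, stripped-down fragment (no metadata section), and `manifest_items_not_in_spine` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPF package documents.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def manifest_items_not_in_spine(opf_text):
    """Return hrefs of XHTML manifest items that no spine itemref points to."""
    root = ET.fromstring(opf_text)
    spine_ids = {
        ref.get("idref") for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    }
    missing = []
    for item in root.findall("opf:manifest/opf:item", OPF_NS):
        is_xhtml = item.get("media-type") == "application/xhtml+xml"
        if is_xhtml and item.get("id") not in spine_ids:
            missing.append(item.get("href"))
    return missing


# Hypothetical content.opf: Contents.html is in the manifest but not in the spine.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch1" href="chapter-1.html" media-type="application/xhtml+xml"/>
    <item id="contents" href="Contents.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>"""
print(manifest_items_not_in_spine(sample_opf))  # ['Contents.html']
```

Any file this reports is in the manifest only, which matches the "not in the spine" case described earlier in the thread.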

Last edited by kovidgoyal; 06-12-2018 at 09:01 AM.

Tags
calibre, convert, css, html, markup

