Quote:
Originally Posted by Artha
I have 0.4.2 and try to do by hand a book from PDF to ePub. I have changed the file to barebones HTML and will attach a CSS file later. Now, things should be nice and clean with only the HTML tags and nothing more.
Yet when I hit „Generate TOC from headings” an id="heading_id_2" or id="heading_id_3" is attached to the headings. Why is that?
And can it be disabled?
|
You don't need the id="heading_id_2" if each chapter is a separate file. All you do in the NCX is point each chapter entry at the file you want, with no need for the # anchor.
What I do is use regex to strip it. I would search for id="heading_id_[0-9]*" and replace with nothing. This works in Notepad++. I've not tried it in Sigil, so I do not know if that regex would work there. Someone may be able to fix it if it's incorrect.
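If you would rather do the same search-and-replace outside an editor, here is a rough Python sketch of it. The function name strip_heading_ids and the sample heading are made up for illustration, and I've added a leading \s+ to the pattern (my own tweak, not part of the Notepad++ search) so the space before the attribute goes too. It also uses [0-9]+ instead of [0-9]* so that it only matches ids that actually end in a number.

```python
import re

# Matches the generated attribute plus the whitespace in front of it,
# e.g. ' id="heading_id_2"'. The leading \s+ is an assumption added here
# so no stray space is left inside the tag.
HEADING_ID = re.compile(r'\s+id="heading_id_[0-9]+"')


def strip_heading_ids(html: str) -> str:
    """Return html with every generated heading_id_N attribute removed."""
    return HEADING_ID.sub("", html)


if __name__ == "__main__":
    sample = '<h1 id="heading_id_2">Chapter One</h1>'
    print(strip_heading_ids(sample))  # prints <h1>Chapter One</h1>
```

Only do this when each chapter is its own file. If any NCX entry still points at a #heading_id_N anchor, stripping the id will break that link.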
Quote:
Originally Posted by Artha
Weird. Why would Calibre use span, or <i>, when there's <em> for that?
|
Because that's what is in the HTML generated from the PDF.
I've seen code from some conversions where there was something like <p class="para"><span>text of the book</span></p> on every line, and it got worse with italics. I was able to remove most of it with regex and then manually clean it up on every line that had italics.
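For what it's worth, that kind of cleanup can be sketched in Python along these lines. This is not the exact regex I used. The name unwrap_plain_spans is made up, and it assumes the useless wrapper is a bare <span> with no attributes that sits directly inside the <p>. Paragraphs with a nested span (the italics case) are deliberately left alone so they can be fixed by hand.

```python
import re

# A <p ...> whose whole content is wrapped in one attribute-less <span>.
# The inner group refuses to cross another <span> or </span>, so
# paragraphs containing nested spans do not match and stay untouched.
WRAPPER = re.compile(
    r'(<p\b[^>]*>)<span>((?:(?!</?span\b).)*)</span>(</p>)',
    re.DOTALL,
)


def unwrap_plain_spans(html: str) -> str:
    """Drop the redundant <span> wrapper from simple paragraphs."""
    return WRAPPER.sub(r'\1\2\3', html)


if __name__ == "__main__":
    line = '<p class="para"><span>text of the book</span></p>'
    print(unwrap_plain_spans(line))  # prints <p class="para">text of the book</p>
```

Regex is fine for a repetitive pattern like this one, but it does not understand HTML, so check the result before saving over the original.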
With Calibre, a lot of the oddities are in the source fed to it.