Hi Jackie. Here's what's been happening. I've done all of the following, starting with the ZIP file you attached in Post 9:
Quote:
Originally Posted by jackie_w
Import your cleaned up HTML (blank lines removed etc) into Calibre and convert to EPUB using the settings detailed in post #9 step 2, with the following minor changes:
Code:
Table of Contents - Level 1 TOC - //*[@class="invisible"]
Table of Contents - Level 2 TOC - //h:h3
Table of Contents - Level 3 TOC - leave empty
but I haven't been able to do this:
Quote:
Originally Posted by jackie_w
Afterwards, you would still need to manually tweak the EPUB's toc.ncx for the 5 'Part' entries, as detailed in post #9 step 3.
because I can't get a workable edited HTML file to convert to EPUB. I did the editing in Kompozer 0.8b3, an open source WYSIWYG HTML editor that I used to create our website. I started with the Post 9 ZIP file & unzipped it with "The Archiver", then did the edits in source code. When I checked the text view in Kompozer, all looked as it should. But when I saved it, some of the punctuation in the resulting HTML went awry: an apostrophe becomes ’, a left quote becomes “, a right quote becomes ”, and so forth.
I posted a query on a Kompozer forum and got several suggestions that didn't work and one that shows promise. Here it is:
The examples you cite are characteristic of character encoding issues. You can change the character encoding on the format>page title and properties window (in Kompozer). Select the character encoding to match your original document. If you look at the html file in a text editor there should be a line in the head section similar to:
Code:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
You would be interested in the charset= part and see if it is in the KompoZer list.
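To illustrate what that suggestion is describing, here is a minimal Python sketch of how curly punctuation gets garbled when a file saved as UTF-8 is read back as Windows-1252. It is only an assumption that these are the two encodings involved in my case, and the function name is made up for the example.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    This mimics a program that opens a UTF-8 file with the wrong charset.
    Bytes that Windows-1252 does not define are shown as U+FFFD.
    """
    return text.encode("utf-8").decode("cp1252", errors="replace")


# Curly punctuation written as escapes so the script itself stays ASCII.
SAMPLES = {
    "apostrophe": "\u2019",
    "left quote": "\u201c",
    "right quote": "\u201d",
}

# Example use (uncomment to see the garbled forms on a UTF-8 terminal):
# for name, char in SAMPLES.items():
#     print(name, "->", misread_as_cp1252(char))
```

Each curly mark is three bytes in UTF-8, so one character turns into three when the wrong charset is applied, which would fit the kind of damage I'm seeing.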
However, I'm having a problem viewing the HTML file in the resident Mac text editor, TextEdit (all that can be seen are the illustrations on a black background, with no visible text), so I'm at an impasse.
Do you know what the character encoding is? Is it derived from the original source file in Word or does Calibre alter it? Do you have any other insight or approach?
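In case it helps, one way to look for the charset without a text editor might be a short Python script. This is only a sketch: the function names are hypothetical, the file name in the comment is a placeholder, and it checks only the charset declared in the file plus whether the bytes are valid UTF-8, not what Calibre or Word actually did.

```python
import re
from typing import Optional

# Matches charset=... in either <meta charset="..."> or the
# http-equiv content="text/html; charset=..." form.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)


def declared_charset(raw: bytes) -> Optional[str]:
    """Return the first charset named in the raw HTML bytes, or None."""
    match = CHARSET_RE.search(raw)
    return match.group(1).decode("ascii") if match else None


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the bytes are valid UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Example use with a placeholder file name:
# with open("book.html", "rb") as handle:
#     data = handle.read()
# print(declared_charset(data), decodes_as_utf8(data))
```

If the declared charset and the validity check disagree, that would suggest the declaration in the head section is not what the file was really saved as.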
One possibility is to avoid working in HTML and convert the Post 9 Zip to EPUB using the parameters you stated in Posts 9 & 18, then do the cleanup in Sigil ... which is the way I created the EPUB attached in Post 15. I share your preference to avoid Sigil, but the only known glitch was the blank page inserted before the book cover, which you resolved.
I do hope these complications aren't vexing to you.