HTML Entities placed in ToC break Kobo Aura
This took me a couple hours to figure out, but it seems to be reproducible and fixable now.
I have a certain ePub with filenames with commas in them. For example, TaleofTwoCities,A_split_000.html, something like that. If I transfer it to my Kobo Aura without converting in Calibre, everything works as expected regarding the Table of Contents and chapters and it displays the chapter title at the bottom of the screen. However, if I convert the ePub > ePub with Calibre, the toc.ncx file lists these files as: Code:
<content src="TaleofTwoCities%2cA_split_000.html"/>
Is there a way to prevent Calibre from putting in these HTML entities?
That isn't an HTML entity; it is URL encoding, which is, IIRC, perfectly legal in NCX. You can always prevent it from happening by renaming the HTML file in the calibre editor to remove the comma from the name. If you open a bug report and attach the original epub file, I'll look into getting the conversion to unquote URLs in the NCX.
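To illustrate the distinction drawn above, here is a minimal sketch using only Python's standard library. It is not calibre's code, just a demonstration: `%2c` is a percent-encoded (URL-encoded) comma, whereas an HTML entity is something like `&amp;`. The sample strings are made up for illustration.

```python
import html
from urllib.parse import quote, unquote

# The file name from the first post, with a literal comma in it.
name = "TaleofTwoCities,A_split_000.html"

# Percent-encoding (URL encoding): the comma becomes %2C.
encoded = quote(name)

# Decoding is case-insensitive, so the %2c seen in toc.ncx maps back to a comma.
decoded = unquote("TaleofTwoCities%2cA_split_000.html")

# An HTML entity is a different mechanism: it escapes markup characters.
entity = html.escape("Tom & Jerry")

print(encoded)  # TaleofTwoCities%2CA_split_000.html
print(decoded)  # TaleofTwoCities,A_split_000.html
print(entity)   # Tom &amp; Jerry
```

A reader that treats the `src` value as a URL should decode `%2c` back to a comma before looking up the file.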
1 Attachment(s)
I don't want to post copyrighted material, so I made a test file to demonstrate the issue. It has html files labeled "TestFile,A-01.html", etc.
When loaded onto the Kobo Aura as is, it should show Chapter One, Chapter Two, and Chapter Three at the bottom of the page for the chapter title. If you then remove the book from the Kobo, convert it with Calibre ePub > ePub, then add the book back, it will no longer show the chapter titles at the bottom of the screen. Remove the file again, edit the toc.ncx file to replace each %2c with a comma, and add the file back to the Kobo. The chapter titles will reappear. Now, to be fair, the ePub validator does complain about these commas, and the book should probably not have commas in the filenames. However, I feel that Calibre is still breaking functionality here by replacing characters with URL codes in filenames listed in the toc.ncx file. If the publisher is putting commas in their filenames, maybe they should be left alone?
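For anyone who wants to script the manual fix described above rather than edit toc.ncx by hand, here is a rough sketch. It assumes the NCX text has already been extracted from the ePub and only touches the `src` attribute of `<content>` elements; the function name `restore_commas_in_ncx` and the regular expressions are my own, not anything from calibre or Kobo.

```python
import re

# Matches the src attribute of a <content> element (double-quoted values only).
CONTENT_SRC = re.compile(r'(<content\b[^>]*?\bsrc=")([^"]*)(")')

# Matches a percent-encoded comma in either case: %2c or %2C.
ENCODED_COMMA = re.compile(r"%2[cC]")


def restore_commas_in_ncx(ncx_text: str) -> str:
    """Replace %2c with a literal comma, but only inside <content src="...">."""

    def fix(match: re.Match) -> str:
        prefix, src, suffix = match.groups()
        return prefix + ENCODED_COMMA.sub(",", src) + suffix

    return CONTENT_SRC.sub(fix, ncx_text)


sample = '<navPoint><content src="TestFile%2cA-01.html"/></navPoint>'
print(restore_commas_in_ncx(sample))
```

Only the comma is decoded here on purpose: blindly unquoting everything could turn sequences such as `%22` into characters that break the XML.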
Once again, URL encoding is perfectly legal in NCX files. That the Kobo does not support it is a bug in the Kobo. Despite that, being the nice guy that I am, I am willing to investigate changing calibre to work around the bug in the Kobo -- there are already dozens of workarounds for device-specific bugs in calibre's conversion pipeline. But let's not mistake where the bug is.
I thought it was recommended not to use any special characters in the file names; just a-z A-Z and 0-9
Quote:
Code:
D:\CalibreLibraries\_Test\William Shakespeare\Romeo et Juliette (401)\Romeo et Juliette - William Shakespeare.pdf
FWIW: according to the IETF, underscore, hyphen/minus, full stop, and tilde are acceptable in URI names.
BR
<rant>Why is it that in the content.opf, the manifest and spine refer to the XHTML files by their 'physical' file names, whereas in the toc.ncx the same files are referred to by their 'percent encoded' URI names? Inconsistencies such as this drive those of us not steeped in the intricacies of 'current technology' nuts. I sometimes wonder if TPTB do it to feed their love of obscurantism.</rant>
I too converted the attached ePub, and using RMSDK (ADE) on my H2O, it worked. I did not try it as a kepub via Access.
I agree you need to give more details. This issue would be best addressed during the conversion from ePub to kepub, which means it is not a Calibre issue.
Edit: If the problem affects ePubs, then it is something Kobo would have to fix in the device firmware. If it only affects Kobo's proprietary KePub format, they might fix it by adding a requirement to their publishing guidelines that the NCX ToC must not contain those html entities, or by removing the html entities when they convert the publisher's ePub into KePub format.