#1 |
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
|
HTML Entities placed in ToC break Kobo Aura
This took me a couple of hours to figure out, but it seems to be reproducible and fixable now.
I have an ePub whose filenames contain commas, for example TaleofTwoCities,A_split_000.html. If I transfer it to my Kobo Aura without converting it in Calibre, the Table of Contents and chapters work as expected, and the chapter title is displayed at the bottom of the screen. However, if I convert the ePub > ePub with Calibre, the toc.ncx file lists these files as:
Code:
<content src="TaleofTwoCities%2cA_split_000.html"/>
Is there a way to prevent Calibre from putting in these HTML entities? |
#2 |
creator of calibre
Posts: 45,253
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That isn't an HTML entity; it is URL encoding, which is, IIRC, perfectly legal in NCX. You can always prevent it from happening by renaming the HTML file in the calibre editor to remove the comma from the name. If you open a bug report and attach the original epub file, I'll look into getting the conversion to unquote URLs in the NCX.
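To illustrate the distinction between URL (percent) encoding and HTML entities, here is a minimal Python sketch using only the standard library. It is not calibre's code; the variable names are made up for the example, and the file name is the one quoted in the first post.

```python
from html import unescape
from urllib.parse import quote, unquote

name = "TaleofTwoCities,A_split_000.html"
encoded = "TaleofTwoCities%2cA_split_000.html"  # as listed in the converted toc.ncx

# Percent-decoding restores the comma; the hex digits are case-insensitive.
decoded = unquote(encoded)

# HTML entity decoding leaves the string untouched, because %2c is not an entity.
entity_decoded = unescape(encoded)

# Python's own quote() also escapes the comma, but emits uppercase hex (%2C).
requoted = quote(name)

print(decoded)
print(entity_decoded)
print(requoted)
```

A reader that resolves NCX links as URLs would be expected to apply the equivalent of `unquote` before looking the file up in the archive.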
|
#3 |
null operator (he/him)
Posts: 21,662
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
|
#4 |
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
|
I don't want to post copyrighted material, so I made a test file to demonstrate the issue. It has html files labeled "TestFile,A-01.html", etc.
When loaded onto the Kobo Aura as is, it should show Chapter One, Chapter Two, and Chapter Three at the bottom of the page for the chapter title. If you then remove the book from the Kobo, convert it with Calibre ePub > ePub, and add the book back, it will no longer show the chapter titles at the bottom of the screen. Remove the file again, edit the toc.ncx file to replace %2c with a comma (,), and add the file back to the Kobo. The chapter titles will reappear.
Now, to be fair, the ePub validator does complain about these commas, and the book should probably not have commas in the filenames. However, I feel that Calibre is still breaking functionality here by replacing characters with URL codes in the filenames listed in the toc.ncx file. If the publisher is putting commas in their filenames, maybe they should be left alone?
Last edited by trekky0623; 12-11-2016 at 08:11 PM. |
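The manual toc.ncx edit described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `unquote_ncx_srcs` is hypothetical, it handles only double-quoted `src` attributes on `<content>` elements, and it skips any value whose decoded form would contain XML-special characters. It operates on the NCX text; extracting toc.ncx from the ePub and repacking it are left out.

```python
import re
from urllib.parse import unquote

# Matches the src attribute of NCX <content> elements (double-quoted form only).
CONTENT_SRC = re.compile(r'(<content\b[^>]*?\bsrc=")([^"]*)(")')

def unquote_ncx_srcs(ncx_text):
    """Percent-decode the src of every <content> element in NCX markup."""
    def fix(match):
        decoded = unquote(match.group(2))
        # Leave the value alone if decoding would introduce XML-special characters.
        if any(ch in decoded for ch in '&<>"'):
            return match.group(0)
        return match.group(1) + decoded + match.group(3)
    return CONTENT_SRC.sub(fix, ncx_text)

print(unquote_ncx_srcs('<content src="TestFile%2cA-01.html"/>'))
# <content src="TestFile,A-01.html"/>
```

The regular expression keeps the rest of the file byte-for-byte unchanged, which a parse-and-reserialize approach would not guarantee.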
#5 |
creator of calibre
Posts: 45,253
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Once again, URL encoding is perfectly legal in NCX files. That the Kobo does not support it is a bug in the Kobo. Despite that, being the nice guy that I am, I am willing to investigate changing calibre to work around the bug in the Kobo -- there are already dozens of workarounds for device-specific bugs in calibre's conversion pipeline. But let's not mistake where the bug is.
|
#6 | |
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
|
Quote:
|
|
#7 |
Grand Sorcerer
Posts: 13,380
Karma: 78877538
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
I thought it was recommended not to use any special characters in file names, just a-z, A-Z, and 0-9.
|
#8 | |
null operator (he/him)
Posts: 21,662
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Code:
D:\CalibreLibraries\_Test\William Shakespeare\Romeo et Juliette (401)\Romeo et Juliette - William Shakespeare.pdf
FWIW:
- According to the IETF, underscore, hyphen/minus, full stop, and tilde are acceptable in URI names.
BR
<rant>Why is it that in the content.opf, the manifest and spine refer to the XHTML files by their 'physical' file names, whereas in the toc.ncx the same files are referred to by their 'percent encoded' URI names? Inconsistencies such as this drive those of us not steeped in the intricacies of 'current technology' nuts. I sometimes wonder if TPTB do it to feed their love of obscurantism.</rant>
Last edited by BetterRed; 12-12-2016 at 05:27 PM. |
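The IETF character set mentioned in this post (letters and digits plus hyphen, full stop, underscore, and tilde, which RFC 3986 calls "unreserved") can be checked with a few lines of Python. This is an illustrative sketch; `uses_only_unreserved` is a hypothetical helper name, and the second file name is invented for contrast.

```python
import re

# RFC 3986 "unreserved" characters: letters, digits, hyphen, full stop, underscore, tilde.
UNRESERVED_ONLY = re.compile(r'[A-Za-z0-9._~-]+')

def uses_only_unreserved(file_name):
    """True if every character in the name is in the unreserved set."""
    return UNRESERVED_ONLY.fullmatch(file_name) is not None

for name in ["TestFile,A-01.html", "TestFile_A-01.html"]:
    print(name, uses_only_unreserved(name))
```

A name that passes this check looks the same whether or not it has been percent-encoded, so the manifest and the NCX cannot disagree about it.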
|
#9 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Did you mention that you are sending the books as kepubs? From your first post, you are either converting to kepub or using the KoboTouchExtended driver. I'm guessing the latter. When I did an epub-to-epub conversion and sent the book as an epub, the book worked OK. When I converted to kepub and sent that, I saw some of the problems you reported.
|
#10 |
Resident Curmudgeon
Posts: 79,448
Karma: 145491800
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I too converted the attached ePub, and using RMSDK (ADE) on my H2O, it worked. I did not try it as a kepub via Access.
I agree you need to give more details. This issue would be best addressed in the conversion from ePub to kepub, which means it is not a Calibre issue. |
#11 | |
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
|
Quote:
|
|
#12 | |
Wizard
Posts: 3,821
Karma: 19162882
Join Date: Nov 2012
Location: Te Riu-a-Māui
Device: Kobo Glo
|
Quote:
Edit: If the problem affects ePubs, then it is something Kobo would have to fix in the device firmware, but if it only affects Kobo's proprietary KePub format, then they might fix it by adding a requirement to their publishing guidelines that the NCX ToC must not contain those HTML entities, or by removing the HTML entities when they convert the publisher's ePub into KePub format.
Last edited by GeoffR; 12-16-2016 at 05:18 PM. Reason: If the problem affects ePubs ... |
|