Old 12-11-2016, 02:03 PM   #1
trekky0623
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
HTML Entities placed in ToC break Kobo Aura

This took me a couple hours to figure out, but it seems to be reproducible and fixable now.

I have a certain ePub with filenames with commas in them. For example, TaleofTwoCities,A_split_000.html, something like that. If I transfer it to my Kobo Aura without converting in Calibre, everything works as expected regarding the Table of Contents and chapters and it displays the chapter title at the bottom of the screen.

However, if I convert the ePub > ePub with Calibre, the toc.ncx file lists these files as:

Code:
<content src="TaleofTwoCities%2cA_split_000.html"/>
It replaces the comma with an HTML entity in the toc.ncx file but not the filename itself. This is not the case in the original ePub. This seems to break chapter handling on the Kobo in some ways, including not showing the chapter title at the bottom of the screen as well as flipping to the wrong position when finishing a chapter.

Is there a way to prevent Calibre from putting in these HTML entities?
Old 12-11-2016, 02:46 PM   #2
kovidgoyal
creator of calibre
Posts: 45,253
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That isn't an HTML entity; it is URL encoding, which is, IIRC, perfectly legal in NCX. You can always prevent it from happening by renaming the HTML file in the calibre editor to remove the comma from the name. If you open a bug report and attach the original EPUB file, I'll look into getting the conversion to unquote URLs in the NCX.
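
To illustrate the distinction made in this reply, here is a minimal Python sketch. It is not calibre's code; it only uses the standard library to show that `%2c` is percent-encoding (URL encoding), while an HTML/XML character reference for a comma would look like `&#44;`. The filename is the one quoted in the first post.

```python
import html
from urllib.parse import quote, unquote

name = "TaleofTwoCities,A_split_000.html"

# Percent-encoding (URL encoding): the comma becomes %2C.
# Hex digits in an escape are case-insensitive, so %2c means the same thing.
encoded = quote(name)
print(encoded)  # TaleofTwoCities%2CA_split_000.html

# Decoding the src value seen in the converted toc.ncx gives back the real filename.
decoded = unquote("TaleofTwoCities%2cA_split_000.html")
print(decoded == name)  # True

# An HTML/XML character reference for a comma is a different mechanism entirely.
from_entity = html.unescape("TaleofTwoCities&#44;A_split_000.html")
print(from_entity == name)  # True
```

A reader that resolves NCX `src` values as URLs would be expected to apply the `unquote` step before looking the file up in the archive.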
Old 12-11-2016, 03:09 PM   #3
BetterRed
null operator (he/him)
Posts: 21,662
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@trekky0623 - see ==>> How do I report a bug?

BR
Old 12-11-2016, 07:56 PM   #4
trekky0623
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
I don't want to post copyrighted material, so I made a test file to demonstrate the issue. It has html files labeled "TestFile,A-01.html", etc.

When loaded onto the Kobo Aura as is, it should show Chapter One, Chapter Two, and Chapter Three at the bottom of the page for the chapter title.

If you then remove the book from the Kobo, convert it with Calibre ePub > ePub, then add the book back, it will no longer show the chapter titles at the bottom of the screen.

Remove the file again, and edit the toc.ncx file to replace %2c with , and add the file back to the Kobo. The chapter titles will reappear.
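
The manual fix described above can be sketched in Python, assuming the toc.ncx has already been extracted from the ePub and read into a string. The helper name `restore_commas` and the regular expression are hypothetical, not part of calibre or any Kobo tool. It deliberately touches only `%2c`/`%2C` inside `src` attributes and leaves other escapes (such as `%20`) alone.

```python
import re

# Matches src="..." or src='...' attribute values (illustrative pattern only).
_SRC_ATTR = re.compile(r'''(\bsrc\s*=\s*)(["'])(.*?)\2''')


def restore_commas(ncx_text: str) -> str:
    """Replace %2c / %2C with a literal comma, inside src attributes only."""

    def fix(match):
        prefix, quote_char, value = match.groups()
        value = re.sub(r"%2c", ",", value, flags=re.IGNORECASE)
        return f"{prefix}{quote_char}{value}{quote_char}"

    return _SRC_ATTR.sub(fix, ncx_text)


sample = '<content src="TestFile%2cA-01.html"/>'
print(restore_commas(sample))  # <content src="TestFile,A-01.html"/>
```

Text outside `src` attributes, such as chapter labels, is left untouched, so a label that happens to contain `%2c` is not altered.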

Now, to be fair, ePub validator does complain about these commas, and the book should probably not have commas in the filenames. However, I feel that Calibre is still breaking functionality here by replacing characters with URL codes in filenames listed in the toc.ncx file. If the publisher is putting commas in their filenames, maybe it should be left alone?
Attached Files
File Type: epub A Test File.epub (2.8 KB, 240 views)

Last edited by trekky0623; 12-11-2016 at 08:11 PM.
Old 12-11-2016, 11:01 PM   #5
kovidgoyal
creator of calibre
Posts: 45,253
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Once again, URL encoding is perfectly legal in NCX files. That the Kobo does not support it is a bug in the Kobo. Despite that, being the nice guy that I am, I am willing to investigate changing calibre to work around the bug in the Kobo -- there are already dozens of workarounds for device-specific bugs in calibre's conversion pipeline. But let's not mistake where the bug is.
Old 12-12-2016, 10:26 AM   #6
trekky0623
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
Quote:
Originally Posted by kovidgoyal View Post
Once again, URL encoding is perfectly legal in NCX files. That the Kobo does not support it is a bug in the Kobo. Despite that, being the nice guy that I am, I am willing to investigate changing calibre to work around the bug in the Kobo -- there are already dozens of workarounds for device-specific bugs in calibre's conversion pipeline. But let's not mistake where the bug is.
That's alright. I understand, and it is a pretty nasty bug in the Kobo. I called them to report the flaw.
Old 12-12-2016, 11:32 AM   #7
PeterT
Grand Sorcerer
Posts: 13,380
Karma: 78877538
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
I thought it was recommended not to use any special characters in the file names; just a-z A-Z and 0-9
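
A quick way to check a book against that recommendation, sketched in Python. The helper `risky_names` is hypothetical, and the rule it applies (file stem limited to a-z, A-Z and 0-9, extension ignored) is just this post's rule of thumb, not something taken from the EPUB specification.

```python
import re
from pathlib import PurePosixPath


def risky_names(names):
    """Return names whose stem contains anything other than a-z, A-Z, 0-9."""
    return [
        n for n in names
        if not re.fullmatch(r"[A-Za-z0-9]+", PurePosixPath(n).stem)
    ]


flagged = risky_names(["TestFile,A-01.html", "chapter01.html"])
print(flagged)  # ['TestFile,A-01.html']
```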
Old 12-12-2016, 05:13 PM   #8
BetterRed
null operator (he/him)
Posts: 21,662
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by PeterT View Post
I thought it was recommended not to use any special characters in the file names; just a-z A-Z and 0-9
But as some would have it, "... it is a custom more honour'd in the breach than the observance."

Code:
D:\CalibreLibraries\_Test\William Shakespeare\Romeo et Juliette (401)\Romeo et Juliette - William Shakespeare.pdf
I've been wondering what a Kobo device does if a URI within an ncx file has an encoded space (%20) in it. Does it deal with them OK? If so then why not an encoded comma (%2C). They're both specified as encoding candidates in RFCs dated as long ago as August 1998.

FWIW: - according to the IETF, underscore, hyphen/minus, full stop, and tilde are acceptable in URI names.
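
As a rough illustration of the point about which characters get encoded, Python's standard `urllib.parse.quote` can be used as a stand-in for a generic URL encoder. This says nothing about what calibre or the Kobo firmware actually do, and it assumes Python 3.7 or later, where the tilde is treated as unreserved.

```python
from urllib.parse import quote

# Underscore, hyphen, full stop and tilde pass through unchanged.
unreserved = quote("_-.~", safe="")
print(unreserved)  # _-.~

# Space and comma are both percent-encoded.
escaped = quote(" ,", safe="")
print(escaped)  # %20%2C
```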

BR

<rant>Why is it that in the content.opf, the manifest and spine refer to the XHTML files by their 'physical' file names, whereas in the toc.ncx the same files are referred to by their 'percent encoded' URI names? Inconsistencies such as this drive those of us not steeped in the intricacies of 'current technology' nuts. I sometimes wonder if TPTB do it to feed their love of obscurantism.</rant>

Last edited by BetterRed; 12-12-2016 at 05:27 PM.
Old 12-12-2016, 06:00 PM   #9
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by trekky0623 View Post
That's alright. I understand, and it is a pretty nasty bug in the Kobo. I called them to report the flaw.
Did you mention that you are sending the books as kepubs? From your first post, you are either converting to kepub or using the KoboTouchExtended driver. I'm guessing the latter. When I did an epub-to-epub conversion and sent the book as an epub, the book worked OK. When I converted to kepub and sent that, I saw some of the problems you reported.
Old 12-12-2016, 06:49 PM   #10
JSWolf
Resident Curmudgeon
Posts: 79,448
Karma: 145491800
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
I too converted the attached ePub and, using RMSDK (ADE) on my H2O, it worked. I did not try it as a kepub via Access.

I agree you need to give more details. This issue would be best addressed in the conversion from ePub to kepub, which means it is not a Calibre issue.
Old 12-16-2016, 04:11 PM   #11
trekky0623
Member
Posts: 20
Karma: 10
Join Date: Apr 2013
Device: Kindle Paperwhite
Quote:
Originally Posted by davidfor View Post
Did you mention that you are sending the books as kepubs? From your first post, you are either converting to kepub or using the KoboTouchExtended driver. I'm guessing the latter. When I did an epub-to-epub conversion and sent the book as an epub, the book worked OK. When I converted to kepub and sent that, I saw some of the problems you reported.
Well, the chapter information at the bottom doesn't work with ePubs anyway. I'm not sure about the navigation problems. But yes, I did tell them this was for kepubs. I currently have an open ticket and just E-mailed them some more info. I'll update this thread if anything comes of it, but as is, I think this is important knowledge to have in case anyone else runs into this issue with their Kobo.
Old 12-16-2016, 04:22 PM   #12
GeoffR
Wizard
Posts: 3,821
Karma: 19162882
Join Date: Nov 2012
Location: Te Riu-a-Māui
Device: Kobo Glo
Quote:
Originally Posted by trekky0623 View Post
Well, the chapter information at the bottom doesn't work with ePubs anyway. I'm not sure about the navigation problems.
The chapter title is normally displayed in the Adobe ePub reader when you open the <-> menu, provided the "Display progress for:" option is set to "Current chapter" and not "Whole book" (same as for the KePub reader).

Edit: If the problem affects ePubs, then it is something Kobo would have to fix in the device firmware. But if it only affects Kobo's proprietary KePub format, then they might fix it by adding a requirement to their publishing guidelines that the NCX toc must not contain those html entities, or by removing the html entities when they convert the publisher's ePub into KePub format.

Last edited by GeoffR; 12-16-2016 at 05:18 PM. Reason: If the problem affects ePubs ...