Hi All,
NOTE1: Perl script with FIX is now added
NOTE2: Added executable! Thank you nrapallo!
Note: I decided to make my post #517 its own thread here in the SONY section.
I've found the TOC on ePUBs generated by calibre to be intolerably slow. An ePUB with forty TOC entries can take up to 90 sec.
Below is what I've found
A TOC whose links use the "#HREF" syntax (a '#' fragment at the end of the URL) makes opening the ePUB extremely slow. With a large enough TOC file this will take a long time or even cause the reader to crash.
PROBLEM:
I've noticed a big performance hit every time I open an ePUB book and use the TOC. You mentioned on a different thread it was due to the #HREF.
TEST:
Okay, I've done a few tests to see how true this is and whether there is a good way to resolve it.
Attached are 3 files:
Test File.epub (unmodified calibre generated TOC)
Test File_NOREF.epub (ALL #HREF removed from all URL in the toc.ncx file)
Test File_noREF_Capter.epub (Only the top level chapters have the #HREF removed, sub chapters have the #HREF)
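To check which of the three variants a given file is, the '#' fragments in its toc.ncx can be counted by nesting level. This is only a rough sketch: the function names are made up for illustration, and it assumes the navPoint entries sit under navMap in the standard NCX namespace and that the ePUB holds a single .ncx file.

```python
import xml.etree.ElementTree as ET
import zipfile

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX = "{" + NCX_URI + "}"


def count_fragments(ncx_bytes):
    """Return (top_level, nested) counts of navPoint targets containing '#'."""
    root = ET.fromstring(ncx_bytes)
    counts = [0, 0]

    def walk(parent, depth):
        # Only direct navPoint children, so depth tracks the TOC nesting level.
        for nav in parent.findall(NCX + "navPoint"):
            content = nav.find(NCX + "content")
            if content is not None and "#" in content.get("src", ""):
                counts[0 if depth == 0 else 1] += 1
            walk(nav, depth + 1)

    nav_map = root.find(NCX + "navMap")
    if nav_map is not None:
        walk(nav_map, 0)
    return tuple(counts)


def count_in_epub(path):
    """Read the first .ncx entry of an ePUB (a zip file) and count its fragments."""
    with zipfile.ZipFile(path) as z:
        name = next(n for n in z.namelist() if n.endswith(".ncx"))
        return count_fragments(z.read(name))
```

An unmodified file should report fragments at both levels, the fully stripped file none at all, and the chapter-only file none at the top level.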
I measured the time to open the TOC of an ePUB book created by calibre.
SOLUTION:
There is a HUGE performance increase from just removing the #HREF part of the URL path from the top-level TOC entries. While there is still a hit on the sub-TOC entries, it is small and tolerable.
To do this, unzip the epub and open the toc.ncx XML file.
Go to the navMap section (it comes right after docTitle).
Then move to the child node at the XPath navMap/navPoint/content
<navMap>
<navPoint>
<content src="URL">
Remove the #HREF portion located in the URL text of the content node (i.e. at the end of the URL there is something like "http://....#calibre_..."). Remove everything from the hash (#) to the end of the URL.
This only has to be done for the top level navPoints to increase the performance.
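For anyone who would rather not edit the file by hand, the steps above can be sketched in Python. This is not the Perl script or the executable mentioned at the top, just a minimal illustration under a few assumptions: the function names are hypothetical, the top-level navPoint entries are taken to be the direct children of navMap in toc.ncx, and any entry ending in .ncx is treated as the TOC.

```python
import sys
import xml.etree.ElementTree as ET
import zipfile

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"
NCX = "{" + NCX_URI + "}"
# Keep the NCX namespace as the default one when the file is written back.
ET.register_namespace("", NCX_URI)


def strip_top_level_fragments(ncx_bytes):
    """Return toc.ncx bytes with '#...' removed from top-level navPoint targets."""
    root = ET.fromstring(ncx_bytes)
    # navMap/navPoint/content matches only navPoints directly under navMap,
    # so sub-chapter entries keep their fragments.
    for content in root.findall(NCX + "navMap/" + NCX + "navPoint/" + NCX + "content"):
        src = content.get("src", "")
        content.set("src", src.split("#", 1)[0])
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def rewrite_epub(src, dst):
    """Copy an ePUB entry by entry, patching the .ncx file on the way."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # Entries are copied in their original order with their original
        # compression, so a leading uncompressed mimetype entry stays that way.
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename.endswith(".ncx"):
                data = strip_top_level_fragments(data)
            zout.writestr(info, data)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python script.py input.epub output.epub
    rewrite_epub(sys.argv[1], sys.argv[2])
```

One caveat: ElementTree drops any DOCTYPE line and comments when it rewrites toc.ncx, so keep the original epub around and test the output on the reader before replacing anything.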
Have Fun,
=X=