MobileRead Forums - HOWTO: Improve performance on calibre generated ePUBs

=X= · 06-04-2009, 03:08 AM

Hi All,
NOTE1: Perl script with FIX is now added
NOTE2: Added executable! Thank you nrapallo!

Note: I decided to make my post #517 its own thread here in the SONY section

I've found the TOC on ePUBs generated by calibre to be intolerable. An ePUB with forty TOC entries can take up to 90 sec.

Below is what I've found:

TOC with "#HREF" syntax makes opening the ePUB extremely slow. With large enough TOC files this will take a long time or even cause the reader to crash.

PROBLEM:
I've noticed a big performance hit every time I try to open up an ePUB book and use the TOC. You mentioned on a different thread it was due to the #HREF.

TEST:
Okay, I've done a few tests to see how true this is and whether there is a good way to resolve it.

Attached are 3 files:
Test File.epub (unmodified calibre generated TOC)
Test File_NOREF.epub (ALL #HREF removed from all URLs in the toc.ncx file)
Test File_noREF_Capter.epub (Only the top level chapters have the #HREF removed, sub chapters have the #HREF)

Measured time to open the TOC in an ePUB book created by calibre:

Test File.epub: 110 sec (1 min 50 sec)
Test File_NOREF.epub: Instant
Test File_noREF_Capter.epub: Instant for top level chapters. Sub chapters varied depending on how many sub elements it had. The last chapter had 40 items and took 1.5 sec.

SOLUTION
There is a HUGE performance increase from just removing the #HREF portion of the URL from the top level TOC entries. While there is still a hit on sub TOC entries, it is small and tolerable.

To do this unzip the epub. Open the toc.ncx XML file.
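If you would rather not unzip by hand, here is a minimal sketch in Python for pulling the NCX file out of the ePUB (an ePUB is just a zip archive). The function name read_ncx is my own, and I'm assuming the TOC is the first file in the archive ending in .ncx, which may sit in a subfolder such as OEBPS/.

```python
import zipfile


def read_ncx(epub_path):
    """Return (member_name, xml_bytes) for the first .ncx file in the ePUB.

    Assumption: the TOC is the first archive member whose name ends in
    ".ncx". The folder it lives in varies from book to book.
    """
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".ncx"):
                return name, zf.read(name)
    raise FileNotFoundError("no .ncx file found in " + epub_path)
```

Calling read_ncx("Test File.epub") should hand back the path of the TOC inside the archive along with its raw XML, ready for inspection or editing.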

Go to the navMap section (the navPoint entries live under navMap, not docTitle).
Then move to the child node at the XPath navMap/navPoint/content:
<navMap>
<navPoint>
<content src="URL"/>

Remove the #HREF portion of the URL in the src attribute of the content node (i.e. at the end of the URL there is something like "http://....#calibre_..."). Remove everything from the hash (#) to the end of the URL.

This only has to be done for the top level navPoints to increase the performance.
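The manual steps above could be automated roughly as follows. This is only a sketch in Python, not the attached Perl script: the function names are made up for illustration, and it assumes a standard NCX file where the top level navPoint elements sit directly under navMap. Note that the standard library XML parser drops the DOCTYPE line and any comments when it rewrites the file, so test the result on your reader before replacing the original.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard namespace used by NCX (toc.ncx) files.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def strip_top_level_fragments(ncx_bytes):
    """Remove '#...' from the content src of top level navPoints only."""
    ET.register_namespace("", NCX_NS)  # keep the default namespace on output
    root = ET.fromstring(ncx_bytes)
    # navMap/navPoint matches direct children only, so sub chapters keep
    # their fragments.
    path = "{%s}navMap/{%s}navPoint/{%s}content" % (NCX_NS, NCX_NS, NCX_NS)
    for content in root.findall(path):
        src = content.get("src", "")
        content.set("src", src.split("#", 1)[0])
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def fix_epub(src_path, dst_path):
    """Copy an ePUB to dst_path, rewriting any .ncx file along the way."""
    with zipfile.ZipFile(src_path) as zin, \
            zipfile.ZipFile(dst_path, "w") as zout:
        for info in zin.infolist():  # original member order is preserved
            data = zin.read(info.filename)
            if info.filename.lower().endswith(".ncx"):
                data = strip_top_level_fragments(data)
            # The mimetype entry must be stored uncompressed in an ePUB.
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            zout.writestr(info.filename, data, compress_type=method)
```

For example, fix_epub("Test File.epub", "Test File_fixed.epub") writes a new file and leaves the original untouched (the output name is just a placeholder), so you can compare TOC timings on the reader side by side.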

Have Fun,

=X=