|  06-04-2009, 03:08 AM | #1 | 
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | 
				
				HOWTO: Improve performance on calibre generated ePUBs
			 
			
			Hi All,

			NOTE1: Perl script with FIX is now added.
			NOTE2: Added executable! Thank you nrapallo!
			Note: I decided to give my post #517 its own thread here in the SONY section.

			I've found the TOC on ePUBs generated by calibre to be intolerable. An ePUB with forty TOC entries can take up to 90 sec. Below is what I've found.

			A TOC with "#HREF" syntax makes opening the ePUB extremely slow. With large enough TOC files this will take a long time or even cause the reader to crash.

			PROBLEM:
			I've noticed a big performance hit every time I try to open an ePUB book and use the TOC. You mentioned on a different thread that it was due to the #HREF.

			TEST:
			Okay, I've done a few tests to see how true this is and whether there is a good solution to resolve it. Attached are 3 files:
			Test File.epub (unmodified calibre generated TOC)
			Test File_NOREF.epub (ALL #HREF removed from all URLs in the toc.ncx file)
			Test File_noREF_Capter.epub (only the top level chapters have the #HREF removed; sub chapters keep the #HREF)

			Measured time to the TOC from an ePUB book created from calibre.
 SOLUTION
 There is a HUGE performance increase from just removing the #HREF part of the URL path from the top level TOC entries. While there is still a hit on sub TOC entries, it is small and tolerable.

 To do this:
 Unzip the epub.
 Open the toc.ncx XML file.
 Go to the navMap section.
 Then move to the child node navMap/navPoint/content, i.e. XPath <navMap> <navPoint> <content src="URL">.
 Remove the #HREF portion located in the URL text of the content node (i.e. at the end of the URL there is something like "http://....#calibre_..."). Remove everything from the hash (#) to the end of the URL.

 This only has to be done for the top level navPoints to increase the performance.

 Have Fun,
 =X=
 Last edited by =X=; 06-05-2009 at 12:47 PM. Reason: Added Note1 and script to fix the TOC | 
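For anyone who would rather see the manual edit above in Python, here is a minimal sketch. It is not the attached Perl script. It assumes a standard NCX file (namespace http://www.daisy.org/z3986/2005/ncx/) in which the top level navPoint elements sit directly under navMap. The function name strip_top_level_fragments is made up for illustration. Note that ElementTree does not preserve the DOCTYPE line or comments when it writes the file back out.

```python
import xml.etree.ElementTree as ET

# Standard NCX namespace (assumed to be the one used by the toc.ncx file).
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def strip_top_level_fragments(ncx_bytes: bytes) -> bytes:
    """Return the NCX with '#...' removed from top level navPoint targets.

    Hypothetical helper for illustration; nested navPoints (sub chapters)
    are left untouched on purpose.
    """
    # Keep the NCX namespace as the default (unprefixed) one on output.
    ET.register_namespace("", NCX_NS)
    root = ET.fromstring(ncx_bytes)
    nav_map = root.find(f"{{{NCX_NS}}}navMap")
    if nav_map is None:
        return ncx_bytes  # nothing to fix
    # findall() without '//' only matches direct children of navMap,
    # which are the top level TOC entries.
    for nav_point in nav_map.findall(f"{{{NCX_NS}}}navPoint"):
        content = nav_point.find(f"{{{NCX_NS}}}content")
        if content is not None and "src" in content.attrib:
            # Drop everything from the hash (#) to the end of the URL.
            content.set("src", content.get("src").split("#", 1)[0])
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Usage would be something like reading toc.ncx from the unzipped epub as bytes, passing it through the function, and writing the result back before zipping the book up again.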
|   |   | 
|  06-04-2009, 11:03 AM | #2 | 
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | 
			
			I started thinking this can easily be done with a Perl script.  Is there any interest in such a script? =X= | 
|   |   | 
|  06-04-2009, 12:42 PM | #3 | 
| Resident Curmudgeon            Posts: 80,675 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			I'd be more interested in a Python script. But I would be interested in such a fix.
		 | 
|   |   | 
|  06-04-2009, 09:36 PM | #4 | 
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | 
			
			Hi Jon, Unfortunately I don't know how to program in Python. I can manage my way through existing code, but writing code from scratch is a different story. I'm almost done with the Perl script, just polishing it up. It should be ready by tonight or tomorrow morning. =X= | 
|   |   | 
|  06-04-2009, 10:05 PM | #5 | |
| Resident Curmudgeon            Posts: 80,675 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | Quote: 
 | |
|   |   | 
|  06-05-2009, 08:51 AM | #6 | 
| Guru            Posts: 714 Karma: 2003751 Join Date: Oct 2008 Location: Ottawa, ON Device: Kobo Glo HD | 
			
			ADE on Sony is really sensitive to the size of the "chunk", which is the reason that I always-always-always hand-edit the results of the conversion to have a single file per TOC entry. The smallest possible size of the xhtml chunk really helps.
		 | 
|   |   | 
|  06-05-2009, 10:53 AM | #7 | |
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | Quote: 
 I used the Expat XML parser, one of the first XML parsers out there. It's easy to use and, more importantly, usually included in Perl distributions. What I'm getting at is that it does not support DOM, so moving or copying nodes is much harder as a result. I'll look into it, though; it shouldn't be too hard. =X= | |
|   |   | 
|  06-05-2009, 10:57 AM | #8 | 
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | 
				
				First version of the ePUB TOC calibre enhancer
			 
			
			Okay, here is my first pass:
			ePub_TOC_enhancer.pl eBookName.epub
			It also has a recursive mode: just put the '-R' switch after the executable to update all ePubs in the current directory and all its sub directories.
			Code:
  usage: ePub_TOC_enhancer.pl eBookName.epub
        -h          : this (help) message
        -R          : Recursively search for ePUBs
  Example: ePub_TOC_enhancer.pl -R 
    Will recursively search for ePUBs in
    the current directory and all its children.
  Example: ePub_TOC_enhancer.pl MYeBook.epub YOUReBook.epub OUReBOOK.epub
    Will only fix the eBooks specified. | 
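Since a Python version was asked for earlier in the thread, here is a rough sketch of how the recursive mode and the in-place update could look in Python. It is not a port of ePub_TOC_enhancer.pl, and the names find_epubs, rewrite_ncx and fix_ncx are all hypothetical. It assumes the TOC lives in a zip member ending in .ncx and that the caller supplies fix_ncx, a function that takes the NCX bytes and returns the fixed bytes.

```python
import os
import tempfile
import zipfile
from pathlib import Path
from typing import Callable, Iterator


def find_epubs(root: Path) -> Iterator[Path]:
    """Yield every .epub below root, similar in spirit to the -R switch."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() == ".epub":
            yield path


def rewrite_ncx(epub: Path, fix_ncx: Callable[[bytes], bytes]) -> None:
    """Rewrite epub in place, passing each .ncx member through fix_ncx."""
    # Build the new archive next to the original so the final swap
    # stays on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(suffix=".epub", dir=epub.parent)
    os.close(fd)
    with zipfile.ZipFile(epub) as src, zipfile.ZipFile(tmp_name, "w") as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename.lower().endswith(".ncx"):
                data = fix_ncx(data)
            # Reusing the original ZipInfo keeps member order and
            # compression type, so 'mimetype' stays first and stored.
            dst.writestr(info, data)
    os.replace(tmp_name, epub)
```

A driver would loop over find_epubs(Path(".")) and call rewrite_ncx on each book. Keep a backup of your ePUBs before trying something like this, since the file is replaced in place.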
|   |   | 
|  06-05-2009, 11:02 AM | #9 | |
| Wizard            Posts: 3,671 Karma: 12205348 Join Date: Mar 2008 Device: Galaxy S, Nook w/CM7 | Quote: 
 What my first post is getting at is that the #HREF in the TOC entries makes SONY/ADE performance really bad, more so than not having a single TOC entry per HTML file. Try the attached fix on a calibre generated ePUB and you will see the difference. =X= | |
|   |   | 
|  06-05-2009, 12:02 PM | #10 | |
| Guru            Posts: 714 Karma: 2003751 Join Date: Oct 2008 Location: Ottawa, ON Device: Kobo Glo HD | Quote: 
 Unlike what calibre is doing, if your book is split at TOC entries (one XHTML file per TOC entry), there is no need for #HREF tags in TOC entries. All of them are referenced as per your fix.   | |
|   |   | 
|  06-05-2009, 12:16 PM | #11 | 
| GuteBook/Mobi2IMP Creator            Posts: 2,958 Karma: 2530691 Join Date: Dec 2007 Location: Toronto, Canada Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN | 
			
			=X= As requested, Windows executable of your Perl script and batch file to reproduce same using the PAR-Packer-588 (v0.973) from CPAN. Now attached to post #1 above. Last edited by nrapallo; 06-05-2009 at 02:34 PM. Reason: attachment moved to post #1 | 
|   |   | 
|  06-05-2009, 05:05 PM | #12 | 
| The Introvert            Posts: 8,307 Karma: 1000077497 Join Date: Jan 2007 Location: United Kingdom Device: Sony Reader PRS-650 & 505 & 500 | 
			
			Guys, you are geniuses!
		 | 
|   |   | 
|  06-05-2009, 05:28 PM | #13 | 
| Wizard            Posts: 1,731 Karma: 3472866 Join Date: Apr 2008 Device: Sony PRS-650 & 350; Kindle Voyage; Kobo Aura HD, Aura One, and Forma | 
			
			I am looking forward to trying this when I get home tonight!  Thanks!!! dordale   | 
|   |   | 
|  06-06-2009, 02:44 AM | #14 | 
| 01000100 01001010            Posts: 1,889 Karma: 2400000 Join Date: Mar 2009 Device: Polyamorous | 
			
			I've been trying but can't get the script to work. Will have to try when I don't have a migraine.
		 | 
|   |   | 
|  06-13-2009, 01:34 PM | #15 | 
| Resident Curmudgeon            Posts: 80,675 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			I don't recall the error, but 0.01 did not work on an ePub I tried it on that had no DRM.
		 | 
|   |   | 