Old 03-08-2011, 12:58 PM   #1
siebert
Developer
Posts: 136
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 3 WiFi / Nexus 4 (Android)
Multi-level TOC broken in epub->epub conversion

Hi,

I took the sample XHTML code from the calibre manual at
http://calibre-ebook.com/user_manual...le-of-contents
and saved it as an HTML file. I then added the file to calibre as an ebook (which results in a zip file), enabled "Force use of auto-generated Table of Contents", set the level1 and level2 TOC values as described, and converted to epub.

The resulting epub contains a two-level TOC as expected, but if I convert this epub to epub again using the same settings, the TOC of the new epub contains only the level1 headings. Bummer.

Ciao,
Steffen
Old 03-08-2011, 01:05 PM   #2
Manichean
Wizard
Posts: 3,130
Karma: 80446
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
I'm guessing you'll have to adjust the level 2 detection XPath to account for changes occurring during conversion. Have a look at the XHTML inside the ePub to see how the relevant headings are tagged.
Old 03-08-2011, 01:16 PM   #3
siebert
Developer
Quote:
Originally Posted by Manichean View Post
I'm guessing you'll have to adjust the level 2 detection XPath to account for changes occurring during conversion. Have a look at the XHTML inside the ePub to see how the relevant headings are tagged.
Nope, that's not the reason. If I move the level 2 XPath to level 1 and clear the level 2 input field, I get a single-level TOC containing the level 2 headings, so the XPath still matches.

Ciao,
Steffen
Old 03-08-2011, 01:30 PM   #4
kovidgoyal
creator of calibre
Posts: 25,367
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The level 2 expression only matches if a level 1 expression matched in the same HTML file.

Why are you forcing use of auto generated toc twice anyway?
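
The per-file rule described above can be illustrated with a small model. This is only a sketch, not calibre's code: it assumes the level 1 and level 2 expressions simply select <h1> and <h2>, uses the standard library's ElementTree instead of calibre's XPath helper, and the function name per_file_toc is made up for the example.

```python
import xml.etree.ElementTree as ET


def per_file_toc(files):
    """Simplified model: an <h2> is only attached to an <h1> that was
    seen earlier in the SAME file (hypothetical helper, not calibre API)."""
    toc = []  # list of (level-1 title, [level-2 titles])
    for source in files:
        current = None  # reset per file: no parent is carried over
        for elem in ET.fromstring(source).iter():  # document order
            if elem.tag == "h1":
                current = (elem.text, [])
                toc.append(current)
            elif elem.tag == "h2" and current is not None:
                current[1].append(elem.text)
            # an <h2> without a preceding <h1> in this file is dropped
    return toc


# One flow containing everything, as in the original HTML input.
single = ["<body><h1>Part I</h1><h2>Chapter 1</h2><h2>Chapter 2</h2></body>"]
# The same content after it has been split into two flows.
split = [
    "<body><h1>Part I</h1><h2>Chapter 1</h2></body>",
    "<body><h2>Chapter 2</h2></body>",
]

print(per_file_toc(single))  # [('Part I', ['Chapter 1', 'Chapter 2'])]
print(per_file_toc(split))   # [('Part I', ['Chapter 1'])]
```

In the split case "Chapter 2" disappears from the TOC because its file contains no level 1 heading, which matches the symptom reported for the second conversion.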
Old 03-08-2011, 01:53 PM   #5
siebert
Developer
Quote:
Originally Posted by kovidgoyal View Post
The level 2 expression only matches if a level 1 expression matched in the same HTML file.
I was afraid that the splitting might be the reason.

Is there a way to have calibre merge the html files again?

Quote:
Why are you forcing use of auto generated toc twice anyway?
Because it took several iterations until I had figured out how to create the TOC the way I wanted and I also overlooked some headings in my first attempt.

Ciao,
Steffen
Old 03-08-2011, 01:56 PM   #6
kovidgoyal
creator of calibre
No, calibre doesn't do merging.
Old 03-08-2011, 02:18 PM   #7
Manichean
Wizard
Quote:
Originally Posted by siebert View Post
Because it took several iterations until I had figured out how to create the TOC the way I wanted and I also overlooked some headings in my first attempt.
Why not just correct the XPath and reconvert from HTML, then?
Old 03-08-2011, 04:02 PM   #8
siebert
Developer
Quote:
Originally Posted by Manichean View Post
Why not just correct the XPath and reconvert from HTML, then?
Because in the real case I started with an epub, not an HTML file.

Ciao,
Steffen
Old 03-08-2011, 04:07 PM   #9
siebert
Developer
Quote:
Originally Posted by kovidgoyal View Post
No, calibre doesn't do merging.
That's a pity. But it should be possible to implement, or does calibre discard some information it won't be able to recover?

Quote:
Originally Posted by kovidgoyal View Post
The level 2 expression only matches if a level 1 expression matched in the same HTML file.
As a properly formatted epub won't meet this requirement, it should be considered a bug and fixed, shouldn't it?

Ciao,
Steffen
Old 03-08-2011, 04:12 PM   #10
kovidgoyal
creator of calibre
Quote:
Originally Posted by siebert View Post
That's a pity. But it should be possible to implement, or does calibre discard some information it won't be able to recover?
It's certainly possible, just not something I care to implement.

Quote:
As a properly formatted epub won't meet this requirement, it should be considered a bug and fixed, shouldn't it?
A properly formatted epub will also have a properly formatted TOC. Again, it's perfectly possible to have the code remember the last encountered level n-1 TOC entry and use that for the next level n toc entry across HTML flows, but again, it's not something I consider important enough to spend time on.
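
The idea of remembering the last level n-1 entry across HTML flows can be sketched as follows. This is an illustration under assumptions, not the code that went into calibre: headings are matched by tag name (LEVEL_TAGS is an assumed mapping), the TOC is a plain nested dict, and build_toc and outline are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Assumed mapping for the sketch: index 0 is level 1, index 1 is level 2, ...
LEVEL_TAGS = ["h1", "h2", "h3"]


def build_toc(files):
    """Attach each level-n heading to the most recent level n-1 entry,
    keeping that memory across files (hypothetical helper)."""
    root = {"title": None, "children": []}
    # last[n] is the most recent entry at level n; last[0] is the TOC root.
    last = [root] + [None] * len(LEVEL_TAGS)
    for source in files:
        for elem in ET.fromstring(source).iter():  # document order
            if elem.tag not in LEVEL_TAGS:
                continue
            level = LEVEL_TAGS.index(elem.tag) + 1
            parent = last[level - 1]
            if parent is None:
                continue  # no level n-1 entry seen anywhere so far
            node = {"title": elem.text, "children": []}
            parent["children"].append(node)
            last[level] = node
            # Deeper levels must not attach to an entry of an older branch.
            for deeper in range(level + 1, len(last)):
                last[deeper] = None
    return root["children"]


def outline(nodes):
    """Flatten the nested dicts into (title, children) tuples for printing."""
    return [(n["title"], outline(n["children"])) for n in nodes]


files = [
    "<body><h1>Part I</h1><h2>Chapter 1</h2></body>",
    "<body><h2>Chapter 2</h2><h1>Part II</h1><h2>Chapter 3</h2></body>",
]
print(outline(build_toc(files)))
```

Because the `last` list lives outside the per-file loop, "Chapter 2" at the top of the second file is filed under "Part I" from the first file, while "Chapter 3" goes under "Part II".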
Old 03-08-2011, 04:24 PM   #11
siebert
Developer
Quote:
Originally Posted by kovidgoyal View Post
A properly formatted epub will also have a properly formatted TOC.
It might. But if I want to use calibre to generate a different TOC according to the rules I define in calibre, I consider it a bug if it doesn't work as documented.

Quote:
Again, it's perfectly possible to have the code remember the last encountered level n-1 TOC entry and use that for the next level n toc entry across HTML flows, but again, it's not something I consider important enough to spend time on.
I don't consider bugs that degrade a conversion to be unimportant.

Ciao,
Steffen
Old 03-08-2011, 04:53 PM   #12
kovidgoyal
creator of calibre
Then feel free to submit a patch
Old 03-09-2011, 11:41 AM   #13
siebert
Developer
Quote:
Originally Posted by kovidgoyal View Post
Then feel free to submit a patch
Ok, here it is:

Code:
diff --git a/src/calibre/ebooks/oeb/transforms/structure.py b/src/calibre/ebooks/oeb/transforms/structure.py
index 0db9b15..90de41d 100644
--- a/src/calibre/ebooks/oeb/transforms/structure.py
+++ b/src/calibre/ebooks/oeb/transforms/structure.py
@@ -95,10 +95,8 @@ class DetectStructure(object):
                     self.log.exception('Failed to mark chapter')
 
     def create_level_based_toc(self):
-        if self.opts.level1_toc is None:
-            return
-        for item in self.oeb.spine:
-            self.add_leveled_toc_items(item)
+        if self.opts.level1_toc is not None:
+            self.add_leveled_toc_items()
 
     def create_toc_from_chapters(self):
         counter = self.oeb.toc.next_play_order()
@@ -145,14 +143,15 @@ class DetectStructure(object):
         return text, href
 
 
-    def add_leveled_toc_items(self, item):
-        level1 = XPath(self.opts.level1_toc)(item.data)
+    def add_leveled_toc_items(self):
         level1_order = []
-        document = item
-
+        added = {}
+        added2 = {}
         counter = 1
-        if level1:
-            added = {}
+        for item in self.oeb.spine:
+            level1 = XPath(self.opts.level1_toc)(item.data)
+            document = item
+
             for elem in level1:
                 text, _href = self.elem_to_link(document, elem, counter)
                 counter += 1
@@ -163,14 +162,18 @@ class DetectStructure(object):
                     added[elem] = node
                     #node.add(_('Top'), _href)
             if self.opts.level2_toc is not None:
-                added2 = {}
                 level2 = list(XPath(self.opts.level2_toc)(document.data))
                 for elem in level2:
                     level1 = None
                     for item in document.data.iterdescendants():
                         if item in added.keys():
                             level1 = added[item]
-                        elif item == elem and level1 is not None:
+                        elif item == elem:
+                            if level1 is None:
+                                if added == {}:
+                                    continue
+                                else:
+                                    level1 = added[added.keys()[-1]]
                             text, _href = self.elem_to_link(document, elem, counter)
                             counter += 1
                             if text:
@@ -183,11 +186,15 @@ class DetectStructure(object):
                         for item in document.data.iterdescendants():
                             if item in added2.keys():
                                 level2 = added2[item]
-                            elif item == elem and level2 is not None:
+                            elif item == elem:
+                                if level2 is None:
+                                    if added2 == {}:
+                                        continue
+                                    else:
+                                        level2 = added2[added2.keys()[-1]]
                                 text, _href = \
                                         self.elem_to_link(document, elem, counter)
                                 counter += 1
                                 if text:
                                     level2.add(text, _href,
-                                    play_order=self.oeb.toc.next_play_order())
-
+                                        play_order=self.oeb.toc.next_play_order())
Ciao,
Steffen
Old 03-09-2011, 01:13 PM   #14
kovidgoyal
creator of calibre
Your patch will not work for the following reasons:

1) You need to use an OrderedDict not {}

2) If a file contains an <h2> as the first element and an <h1> after it, then the <h2> will be added to the <h1> from that file instead of the <h1> from the previous file.

I have fixed both issues, but I haven't really tested my fix, so let me know if there are any problems.
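
Both points can be seen in a reduced model. The sketch below is not the patch or the fix themselves: headings are plain (tag, title) tuples with unique titles, and the two function names are invented for the example. On the first point, `added.keys()[-1]` only works on Python 2, where the key order of a plain dict is arbitrary; an OrderedDict guarantees that "last key" means "last inserted" (plain dicts only guarantee insertion order from Python 3.7 on).

```python
from collections import OrderedDict

# Headings of two consecutive files, in document order (illustrative data).
previous_file = [("h1", "Part I"), ("h2", "Chapter 1")]
current_file = [("h2", "Chapter 2"), ("h1", "Part II"), ("h2", "Chapter 3")]


def parent_by_last_registered(previous, current, target):
    """Model of the fallback in point 2: every <h1> of the current file is
    registered before the <h2> entries are placed, so "last registered"
    can be a heading that comes AFTER the <h2> in the same file."""
    added = OrderedDict()
    for tag, title in previous + current:
        if tag == "h1":
            added[title] = title
    parent = None
    for tag, title in current:
        if tag == "h1":
            parent = added[title]
        elif title == target:
            if parent is None and added:
                parent = next(reversed(added))  # last inserted key
            return parent
    return None


def parent_by_document_order(previous, current, target):
    """Track the most recent <h1> while walking all files in order."""
    parent = None
    for tag, title in previous + current:
        if tag == "h1":
            parent = title
        elif title == target:
            return parent
    return None


print(parent_by_last_registered(previous_file, current_file, "Chapter 2"))  # Part II
print(parent_by_document_order(previous_file, current_file, "Chapter 2"))   # Part I
```

Walking the headings in document order and remembering the most recent level 1 entry avoids the misplacement without needing a "last key" lookup at all.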
Old 03-09-2011, 05:38 PM   #15
siebert
Developer
Quote:
Originally Posted by kovidgoyal View Post
Your patch will not work for the following reasons:
It worked at least for my test documents and the real book I wanted to convert in the first place.

Quote:
1) You need to use an OrderedDict not {}
That's correct. On my Python implementation the plain dictionary happens to behave the same way as the OrderedDict, but only by accident.

Quote:
2) If a file contains an <h2> as the first element and an <h1> after it, then the <h2> will be added to the <h1> from that file instead of the <h1> from the previous file.
Also correct, I didn't think of that case during my tests.

I've now created such an epub with Sigil, and your implementation creates the correct TOC, while mine misplaces the <h2> as described.

Quote:
I have fixed both issues, but I haven't really tested my fix, so let me know if there are any problems.
Thanks, it works fine for me.

Ciao,
Steffen