04-29-2015, 04:22 AM   #34
darryl
Quote:
Originally Posted by Hitch
I still really want to know what the hell caused it.

Hitch
I'm probably the least qualified on this thread, but I'll give you my understanding of the situation.

I think:

1. Some of us assumed, incorrectly, that the .azw3 file did not have a working toc. In fact it did.
2. What we were looking at, of course, was not the actual files but the "reconstruction" of those files by either Calibre or KindleUnpack.
3. Both programs were extracting an incorrect toc.ncx. This toc had links referring to the single html file without the necessary "fragments".
4. In fact, the azw3 file does contain fragment information for the links, in the form of UUIDs.
5. Neither Calibre nor KindleUnpack was extracting these UUID fragments from the azw3 file because, to use Kovid's description of the changes made, "kindlegen produced azw3 files that do not use normal HTML anchors for linking". It seems that both programs expected normal HTML anchors and hence could not reconstruct the toc correctly from the file.
6. I don't know precisely what a normal HTML anchor would be for this purpose, but clearly a simple <h1> tag does not qualify.
7. I don't know why kindlegen dealt with the situation in the way that it did. Perhaps the original input toc was invalid and it generated a new one based on the <h1> tags? With my limited knowledge and experience of this area, I cannot even dismiss the possibility that this was happening with every .azw3 file where the input was a single html file, which, as I think you said in an earlier post, was probably based on a Word document.
8. Kovid's modifications mean that Calibre, including the viewer, now deals correctly with these fragments. Kevin, or perhaps DiapDealer, will do the same for KindleUnpack, which will presumably result in KindleUnpack extracting a toc.ncx from these files that includes the UUID "fragment" information.
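
To make point 3 a bit more concrete, here is a rough sketch of how one might check an extracted toc.ncx for links that lack a fragment. This is only my illustration, not anything taken from Calibre or KindleUnpack. The function name, the file name part0000.html and the fragment uuid_example are all made up for the example; the only thing I am relying on is the standard NCX namespace and Python's built-in XML parser.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by toc.ncx files.
NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def links_missing_fragments(ncx_text):
    """Return (label, src) pairs for navPoints whose src has no '#fragment'."""
    root = ET.fromstring(ncx_text)
    ns = {"ncx": NCX_NS}
    missing = []
    for nav_point in root.iter(f"{{{NCX_NS}}}navPoint"):
        content = nav_point.find("ncx:content", ns)
        if content is None:
            continue  # malformed entry, nothing to check
        src = content.get("src", "")
        if "#" not in src:
            text = nav_point.find("ncx:navLabel/ncx:text", ns)
            label = (text.text or "") if text is not None else ""
            missing.append((label, src))
    return missing


# Hypothetical sample: the first entry points at the bare file (the broken
# case described above), the second carries a made-up fragment.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="np1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="part0000.html"/>
    </navPoint>
    <navPoint id="np2" playOrder="2">
      <navLabel><text>Chapter 2</text></navLabel>
      <content src="part0000.html#uuid_example"/>
    </navPoint>
  </navMap>
</ncx>"""

if __name__ == "__main__":
    for label, src in links_missing_fragments(SAMPLE_NCX):
        print(f"{label}: {src} has no fragment")
```

If every entry in a single-file book turns up in that list, then every toc link lands at the top of the same html file, which is what made the toc look broken in the extracted files.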

As I said, this is my understanding. I am quite possibly wrong in at least some particulars, and would appreciate any corrections from you or others here with a better understanding. I found the problem quite intriguing, and learnt a little about a few different areas, including the structure of .mobi and .azw3 files, KindleUnpack and of course TOCs.

regards,

Darryl