04-28-2015, 03:08 PM | #31 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
I'm out of town traveling and will fix it for KindleUnpack when I get back home, unless DiapDealer beats me to it. KevinH |
04-28-2015, 08:50 PM | #32 |
Wizard
Posts: 3,108
Karma: 60231510
Join Date: Nov 2011
Location: Australia
Device: Kobo Aura H2O, Kindle Oasis, Huwei Ascend Mate 7
|
Thanks Kevin. Enjoy your trip.
|
04-28-2015, 09:57 PM | #33 |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
I still really want to know what the hell caused it.
Hitch |
04-29-2015, 04:22 AM | #34 |
Wizard
Posts: 3,108
Karma: 60231510
Join Date: Nov 2011
Location: Australia
Device: Kobo Aura H2O, Kindle Oasis, Huwei Ascend Mate 7
|
I'm probably the least qualified on this thread, but I'll give you my understanding of the situation.
I think:
1. Some of us assumed, incorrectly, that the .azw3 file did not have a working toc. In fact it did.
2. What we were looking at, of course, is not the actual files, but the "reconstruction" of those files by either Calibre or KindleUnpack.
3. Both programs were extracting an incorrect toc.ncx. This toc had links referring to the single html file without the necessary "fragments".
4. In fact, the azw3 file does contain fragment information for the links in the form of UUIDs.
5. Neither Calibre nor KindleUnpack was extracting these UUID fragments from the azw3 file, because, to use Kovid's description of the changes made, "kindlegen produced azw3 files that do not use normal HTML anchors for linking". It seems that both programs were expecting the use of normal HTML anchors and hence could not reconstruct the toc correctly from the file.
6. I don't know precisely what a normal HTML anchor would be for this purpose, but clearly a simple <h1> tag does not qualify.
7. I don't know why kindlegen dealt with the situation in the way that it did. Perhaps the original input toc was invalid and it generated a new one based on the <h1> tags? With my limited knowledge and experience of this area I cannot even dismiss the possibility that this was happening with every .azw3 file where the input was a single html file, as I think you said in an earlier post, probably based on a Word document.
8. Kovid's modifications mean that Calibre, including the viewer, now correctly deals with these fragments. Kevin, or perhaps DiapDealer, will do the same for KindleUnpack, which will presumably result in KindleUnpack extracting a toc.ncx from these files including the UUID "fragment" information.
As I said, this is my understanding. I am quite possibly wrong in at least some particulars, and would appreciate any corrections by you or others here with a better understanding.
I found the problem quite intriguing, and learnt a little about a few different areas, including the structure of .mobi and .azw3 files, KindleUnpack and, of course, TOCs. Regards, Darryl |
04-29-2015, 04:56 AM | #35 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Links in an azw3 file are in the form of byte offsets into the raw html. In the past these byte offsets have always pointed to tags that have an id attribute. So calibre would simply use that id attribute as the anchor when converting the byte offset based link into a normal html link. Your problem file had byte offsets that point to tags with no id attribute. In this case calibre would simply point to the file, with no anchor.
The assumption that tags pointed to by byte offsets will always have ids is reasonable, since typically azw3 files are created from epub/html, where links always use ids. However, in the case of your file, the file was presumably created in a way that did not require ids. The links could have been created, for example, using XPath expressions, or some such. So in this case, with my fix, calibre now generates a unique id for the tag when one is missing. Oh, and I should mention that the azw3 input plugin in calibre is largely based on KevinH's original work reverse engineering the azw3 format. |
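For anyone following along, here is a minimal Python sketch of the idea described in the post above: take a byte offset into the raw html, look at the tag found there, reuse its id if it has one, and otherwise generate a unique id and write it into the tag. This is not calibre's or KindleUnpack's actual code. The function name `anchor_for_offset`, the `make_id` parameter, and the regex-based tag matching are all assumptions made for illustration.

```python
import re
import uuid

# Hypothetical helpers for illustration only; not taken from calibre or KindleUnpack.
TAG_RE = re.compile(rb'<[A-Za-z][^>]*>')                      # an opening tag
TAG_NAME_RE = re.compile(rb'<[A-Za-z][A-Za-z0-9]*')           # "<" plus the tag name
ID_RE = re.compile(rb'\sid\s*=\s*["\']([^"\']+)["\']')        # an id="..." attribute


def anchor_for_offset(raw, offset, make_id=lambda: "id-" + uuid.uuid4().hex):
    """Return (raw, anchor) for the first opening tag at or after `offset`.

    If the tag already has an id, raw is returned unchanged and the anchor
    is that id. Otherwise a generated id is inserted into the tag and the
    patched bytes are returned along with the new anchor.
    """
    m = TAG_RE.search(raw, offset)
    if m is None:
        return raw, None                      # nothing to link to
    tag = m.group(0)
    found = ID_RE.search(tag)
    if found:
        return raw, found.group(1).decode("utf-8")
    new_id = make_id()
    name_end = TAG_NAME_RE.match(tag).end()   # insert right after the tag name
    patched = tag[:name_end] + b' id="' + new_id.encode("utf-8") + b'"' + tag[name_end:]
    return raw[:m.start()] + patched + raw[m.end():], new_id


raw = b'<html><body><h1>One</h1><h1 id="two">Two</h1></body></html>'
patched, anchor = anchor_for_offset(raw, raw.index(b'<h1>'), make_id=lambda: "gen1")
```

Note that inserting an id shifts every later byte, so in a sketch like this a list of link offsets would need to be handled from the highest offset to the lowest, or the offsets adjusted after each insertion.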
04-29-2015, 05:00 AM | #36 |
Wizard
Posts: 3,108
Karma: 60231510
Join Date: Nov 2011
Location: Australia
Device: Kobo Aura H2O, Kindle Oasis, Huwei Ascend Mate 7
|
Kovid. Thanks for taking the time to explain.
|
04-29-2015, 08:46 PM | #37 | |
Grand Sorcerer
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
I only had the one for testing, so... https://github.com/kevinhendricks/KindleUnpack |
|
04-30-2015, 03:14 AM | #38 |
Sigil Developer
Posts: 7,627
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi DiapDealer,
Thanks! Looks good! I will make an official 0.81 release upon my return. From Kovid's description it almost sounds like someone passed an old single file mobi 6 as input into some new kindlegen version (remember the filepos internal link destinations) and it converted the filepos literally. Take care, KevinH |
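As a reminder of what the filepos links mentioned above look like: in old mobi 6 markup, an internal link is an `<a>` tag carrying a `filepos` attribute whose value is a byte position in the raw html, not the name of an anchor. A rough Python sketch for listing those positions follows. The function name `filepos_targets` and the regex are assumptions for illustration, not KindleUnpack code.

```python
import re

# Hypothetical helper for illustration; matches <a filepos=0000000042> with or without quotes.
FILEPOS_RE = re.compile(rb'<a\s+filepos=["\']?(\d+)["\']?')


def filepos_targets(raw):
    """Return the sorted, de-duplicated byte positions referenced by filepos links."""
    return sorted({int(m.group(1).decode("ascii")) for m in FILEPOS_RE.finditer(raw)})


sample = b'<p><a filepos=0000000042>Chapter 1</a> <a filepos="0000000100">Chapter 2</a></p>'
targets = filepos_targets(sample)
```

If a converter carried such positions over literally, the link destinations would be plain byte offsets with no id on the target tag, which would fit the behaviour Kovid described.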
04-30-2015, 08:09 AM | #39 |
Grand Sorcerer
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I've been trying to duplicate it using Kindlegen/Previewer, but I just can't seem to make it happen. There's an update for KindlePreviewer that I'll check out next. My curiosity has been piqued.
|
04-30-2015, 02:42 PM | #40 |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
|
04-30-2015, 03:34 PM | #41 |
Grand Sorcerer
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
|
04-30-2015, 03:35 PM | #42 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
KevinH's guess that it is an all-new-and-improved way of totally mangling mobi7 input sounds reasonable.
I wonder if KDP uploads are now being converted when people upload a calibre-converted mobi7... because they are absolutely insistent on using calibre for some totally inexplicable reason. Amazon probably wants customers to actually get AZW3 if possible. You may not be able to duplicate this with *just* kindlegen/previewer. |
04-30-2015, 04:36 PM | #43 |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
|
04-30-2015, 06:39 PM | #44 |
Grand Sorcerer
Posts: 27,545
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I'm actually not KDP enabled.
|
04-30-2015, 11:24 PM | #45 |
Bookmaker & Cat Slave
Posts: 11,460
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Uh, well....I think that the general consensus of utterly uninformed opinion is that the whatever-it-is is happening at the KDP, no? You wanna send me a file to upload there, and hand back to you?
I also have a Word file MOBI around here somewhere, that I made for a video demo, I think, but I made it about a year ago. I probably still have the source Word file, though, and could repeat the experiment. It was a PG copy of a Christie Tommy & Tuppence that I cleaned up and tweaked a bit. Whatcha want? Hitch |