04-29-2015, 04:56 AM   #35
kovidgoyal
creator of calibre
Links in an azw3 file are stored as byte offsets into the raw HTML. In the past, these byte offsets have always pointed to tags that have an id attribute, so calibre would simply use that id as the anchor when converting the byte-offset-based link into a normal HTML link. Your problem file had byte offsets that point to tags with no id attribute. In that case calibre would simply link to the file, with no anchor.

The assumption that tags pointed to by byte offsets will always have ids is reasonable, since azw3 files are typically created from epub/html, where links always use ids. Your file, however, was presumably created in a way that did not require ids. The links could have been created using XPath expressions, for example, or something similar. So with my fix, calibre now generates a unique id for the tag when one is missing.
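
To make the idea concrete, here is a rough Python sketch of resolving a byte offset to an anchor and generating an id when the tag has none. This is not calibre's actual code: the function name `anchor_for_offset`, the `gen_id_` prefix and the `used_ids` set are all made up for illustration, and it assumes the offset points exactly at the `<` of an opening tag and that no attribute value contains a `>`.

```python
import re

# Matches an id attribute inside an opening tag (hypothetical helper pattern).
ID_RE = re.compile(rb'''\sid\s*=\s*(["'])(.*?)\1''', re.IGNORECASE | re.DOTALL)


def anchor_for_offset(raw, offset, used_ids, prefix="gen_id_"):
    """Return (raw, anchor) for the tag that starts at byte `offset`.

    If the tag already has an id, that id is the anchor and `raw` is
    returned unchanged. Otherwise a new unique id is inserted into the tag.
    Assumption: `offset` points at the '<' of an opening tag.
    """
    if raw[offset:offset + 1] != b"<":
        raise ValueError("offset does not point at the start of a tag")
    end = raw.find(b">", offset)
    if end == -1:
        raise ValueError("unterminated tag")

    match = ID_RE.search(raw[offset:end])
    if match:
        return raw, match.group(2).decode("utf-8")

    # No id present: pick the first unused generated id.
    n = 1
    while f"{prefix}{n}" in used_ids:
        n += 1
    new_id = f"{prefix}{n}"
    used_ids.add(new_id)

    # Insert before '>' (or before '/>' for a self-closing tag).
    insert_at = end - 1 if raw[end - 1:end] == b"/" else end
    attr = f' id="{new_id}"'.encode("utf-8")
    return raw[:insert_at] + attr + raw[insert_at:], new_id


raw = b'<p id="a">x</p><p>y</p>'
used = {"a"}
# Handle the larger offset first: inserting bytes shifts everything after it.
raw, second = anchor_for_offset(raw, 15, used)
raw, first = anchor_for_offset(raw, 0, used)
```

The link would then be written as the file name plus `#` and the returned anchor, instead of the bare file name.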

Oh, and I should mention that the azw3 input plugin in calibre is largely based on KevinH's original work reverse engineering the azw3 format.