MobileRead Forums - View Single Post - KRDS - A parser for Kindle reader data store files

jhowell · 08-14-2019, 08:09 AM

Quote:

Originally Posted by shamanNS

So, this script does not extract the actual text that was highlighted?

That is correct. The script decodes whatever is in the files indicated in the first post of this thread. The reader application has no need to store the actual text separately from the book format file.

The files that this program decodes are linked to the book's content by fields with "position" in their names. These are strings that identify where to find content within a book, and they are interpreted differently for each book format.

KF8 (azw3) format appears to be the simplest case. The position is a decimal number giving an offset within the raw HTML content of the book, as can be obtained using the kindleunpack software. See the work done by j.p.s for an example of how to make use of this information.
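As a rough illustration of the KF8 case, here is a minimal Python sketch. It assumes the raw HTML has already been unpacked into a bytes object and that the position is a byte offset into it. The function name html_at_position and the sample data are made up for this example and are not part of KRDS or kindleunpack.

```python
def html_at_position(raw_html: bytes, position: str, length: int = 80) -> bytes:
    """Return up to `length` bytes of raw HTML starting at a KF8 position.

    Assumption: `position` is a decimal string giving a byte offset.
    """
    offset = int(position)
    if not 0 <= offset <= len(raw_html):
        raise ValueError(
            f"position {offset} is outside the raw HTML ({len(raw_html)} bytes)"
        )
    return raw_html[offset:offset + length]


# Stand-in for the raw HTML that an unpacking tool would produce.
sample = b"<html><body><p>Hello world</p></body></html>"
position = str(sample.index(b"Hello"))
print(html_at_position(sample, position, length=11))  # b'Hello world'
```

The result is still raw HTML, so any tags inside the slice would need to be stripped before it reads as plain text.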

MOBI (azw) format is similar, but there appears to be additional information that I have not attempted to decode.

KFX uses two values separated by a colon. The first is a base64 encoding of the eid and offset, which are fields used internally by KFX to determine the location of content. The second is the actual position number, which in the case of KFX counts visible Unicode characters instead of raw HTML bytes.
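Splitting a KFX position string into its two parts could look something like the sketch below. It assumes the standard base64 alphabet (possibly with the padding omitted) and leaves the decoded eid/offset bytes uninterpreted, since their layout is not described here. The function name split_kfx_position and the sample value are invented for illustration.

```python
import base64
from typing import Tuple


def split_kfx_position(position: str) -> Tuple[bytes, int]:
    """Split a KFX position into (decoded eid/offset bytes, position number)."""
    encoded, sep, number = position.partition(":")
    if not sep:
        raise ValueError(f"expected 'base64:number', got {position!r}")
    # Assumption: restore any missing '=' padding before decoding.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.b64decode(padded), int(number)


# Made-up value for demonstration only, not taken from a real book.
raw, number = split_kfx_position("AQIDBA==:1234")
print(raw.hex(), number)
```

The second value counts visible characters, so it cannot be used directly as an offset into raw HTML the way a KF8 position can.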

I have not looked into how position numbers are handled in the other formats that Kindle supports.