Core to KFX is the ion data representation, a means of encoding a complex data structure into a compact binary format. It includes representations for numbers, strings, symbols, and various types of structured data.
The separate ion data structures that make up a book are called fragments or entities. In KDF files produced by Kindle Previewer 3 they are placed in an SQLite database.
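Because the fragments sit in an SQLite database, a KDF file can be opened for a first look with Python's standard sqlite3 module. The sketch below only lists the tables it finds and assumes nothing about the KDF schema; the helper name `list_tables` and the file name in the usage comment are hypothetical.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in an open SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# Usage (hypothetical path to a KDF file produced by Kindle Previewer 3):
#     conn = sqlite3.connect("book.kdf")
#     print(list_tables(conn))
```

Listing the tables first avoids guessing at column names before the actual layout has been seen.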
In KDF the fragments form a hierarchy containing the book content:
- document_data contains a list of sections in reading order.
- Each section corresponds to an HTML file in the source EPUB. It has a page template and a reference to a story.
- A story contains a list of content. Content types are based on HTML: container (nested div), text (div, p, h1, etc.), image (img), horizontal_rule (hr), list (ul, ol), listitem (li), table (table), etc.
- Sets of formatting instructions are grouped into a style whose properties are based on HTML attributes and CSS properties. (For example, "background-color" becomes "fill_color".)
- There are also fragments containing metadata and navigation (toc, page numbers, locations, and positions).
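The hierarchy above can be pictured with a rough Python model. This is only an illustration of how the pieces relate, not the actual fragment layout: every class and field name here (`Book`, `Section`, `Story`, `Content`, `content_kinds`), the idea that content refers to a style by name, and the sample values are assumptions made for the sketch. Only the content type names and the "fill_color" property come from the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Content:
    kind: str                     # e.g. "container", "text", "image", "list"
    style: Optional[str] = None   # assumed: refers to a style by name
    children: list["Content"] = field(default_factory=list)


@dataclass
class Story:
    name: str
    content: list[Content] = field(default_factory=list)


@dataclass
class Section:
    name: str
    page_template: str
    story: str                    # reference to a story by name


@dataclass
class Book:
    sections: list[Section]             # document_data: reading order
    stories: dict[str, Story]
    styles: dict[str, dict[str, str]]   # style name -> properties


def content_kinds(book: Book) -> list[str]:
    """List content kinds depth-first, following sections in reading order."""
    kinds: list[str] = []

    def visit(item: Content) -> None:
        kinds.append(item.kind)
        for child in item.children:
            visit(child)

    for section in book.sections:
        for item in book.stories[section.story].content:
            visit(item)
    return kinds


# Made-up sample book: one section whose story holds a styled container.
book = Book(
    sections=[Section("s1", "template1", "story1")],
    stories={
        "story1": Story("story1", [
            Content("container", style="boxed",
                    children=[Content("text"), Content("image")]),
            Content("horizontal_rule"),
        ])
    },
    styles={"boxed": {"fill_color": "#eeeeee"}},  # CSS background-color
)

print(content_kinds(book))  # ['container', 'text', 'image', 'horizontal_rule']
```

Keeping stories and styles in dictionaries keyed by name mirrors the way sections point at a story by reference rather than embedding it.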
KFX appears to contain pretty much the same contents as KDF, just packed in a proprietary container format. (At least that holds for the visible parts of KFX, those not protected by DRM. I assume that the hidden data also corresponds.)