MobileRead Forums - View Single Post - .azw1 or .tpz, antiquated Topaz sample file

Quoth · 10-14-2025, 03:55 PM

This is the bit where Topaz and DjVu are similar:

Quote:

The JB2 encoding method identifies nearly identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.

https://en.wikipedia.org/wiki/DjVu
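
To make the quoted idea concrete, here is a toy sketch in Python. It is not the real JB2 codec: JB2 matches nearly identical shapes and then compresses the dictionary and positions, whereas this sketch only deduplicates exactly equal bitmaps. The names encode_page and decode_page and the tuple layout are made up for illustration.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]          # rows of '0'/'1' characters (assumed toy format)
Mark = Tuple[Bitmap, int, int]    # a shape plus its x, y position on the page


def encode_page(marks: List[Mark]) -> Tuple[List[Bitmap], List[Tuple[int, int, int]]]:
    """Store each distinct shape once, then record where each one occurs.

    Simplification: shapes must match exactly. Real JB2 also groups
    nearly identical shapes.
    """
    shapes: List[Bitmap] = []                 # the shape dictionary
    index: Dict[Bitmap, int] = {}             # bitmap -> position in shapes
    placements: List[Tuple[int, int, int]] = []
    for bitmap, x, y in marks:
        if bitmap not in index:
            index[bitmap] = len(shapes)
            shapes.append(bitmap)
        placements.append((index[bitmap], x, y))
    return shapes, placements


def decode_page(shapes: List[Bitmap],
                placements: List[Tuple[int, int, int]]) -> List[Mark]:
    """Rebuild the list of marks from the dictionary and the placements."""
    return [(shapes[i], x, y) for i, x, y in placements]
```

With two occurrences of the same letter shape on a page, the dictionary holds that bitmap once and the placements list refers to it twice.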

In contrast, a scanned image in a PDF might simply be an encapsulated TIFF.

Quote:

A PDF file is often a combination of vector graphics, text, and bitmap graphics. The basic types of content in a PDF are:

Typeset text stored as content streams (i.e., not encoded in plain text);
Vector graphics for illustrations and designs that consist of shapes and lines;
Raster graphics for photographs and other types of images; and
Other multimedia objects.

In later PDF revisions, a PDF document can also support links (inside document or web page), forms, JavaScript (initially available as a plugin for Acrobat 3.0), or any other types of embedded contents that can be handled using plug-ins.

PDF combines three technologies:

An equivalent subset of the PostScript page description programming language but in declarative form, for generating the layout and graphics.
A font-embedding/replacement system to allow fonts to travel with the documents.
A structured storage system to bundle these elements and any associated content into a single file, with data compression where appropriate.

https://en.wikipedia.org/wiki/PDF

DjVu beats PDF if the source is only scanned from paper, though like PDF it can have an OCR text layer to help search.

PDF beats DjVu for output from maths typesetting, vector art, word processing, DTP, etc.

Both show a WYSIWYG rendering of what might be printed on paper, and for DjVu the intention was a same-size paper source. Normally the actual page size is pre-encoded into both.

Topaz takes the compression/encoding idea of DjVu (and the OCR overlay for search that scanned to PDF + OCR also has), but instead of replicating the original layout it reflows and re-paginates for the actual screen.
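
A rough sketch of what "reflow for the actual screen" means when the content is a run of glyph references and not positioned marks. This is not the Topaz file format or Amazon's algorithm, just a greedy line breaker under assumed inputs: each word is a list of glyph ids, glyph_widths maps an id to a pixel width, and the reflow function name and the space parameter are hypothetical.

```python
from typing import Dict, List

Word = List[int]  # a word as a list of glyph ids (assumed representation)


def reflow(words: List[Word], glyph_widths: Dict[int, int],
           screen_width: int, space: int = 4) -> List[List[Word]]:
    """Greedily pack words into lines no wider than screen_width.

    A word wider than the screen still gets a line of its own.
    """
    lines: List[List[Word]] = []
    current: List[Word] = []
    used = 0  # pixel width already used on the current line
    for word in words:
        width = sum(glyph_widths[g] for g in word)
        needed = width if not current else used + space + width
        if current and needed > screen_width:
            lines.append(current)      # line is full, start a new one
            current = [word]
            used = width
        else:
            current.append(word)
            used = needed
    if current:
        lines.append(current)
    return lines
```

The same word list breaks into different lines for a 40 pixel and a 100 pixel wide screen, which is the point: the line breaks are decided at reading time, not stored from the paper original.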

EDIT:
Also, of course, with Topaz and DjVu the work is done by the creator's tools (the reader's rendering is simple), whereas non-scanned PDF (with PostScript from LaTeX, vector art, etc.), azw3/KF8, KFX, epub2 and especially epub3 (with JavaScript, reflowable and fixed layout) require more work from the reader. mobi/KF7 is HTML3 and has no CSS, so it is pretty simple to render, especially as it only has three font faces (serif, sans and monospace), each in normal, bold, italic and bold italic.