Quote:
Originally Posted by colinsky
Are you aware of documentation (or reverse engineering) that describes the word boundary information in KF8 or KFX?
I have not seen any documentation on this. I looked into the KF8 GESW years ago but did not write up my findings. If I recall correctly, it is a compressed table of coded instructions for parsing the raw HTML content to determine which bytes make up each word, taking into account that a word's bytes might not be contiguous because HTML markup can fall in the middle of it.
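
To illustrate the problem only (this is not the GESW format, which I have not documented), here is a rough Python sketch of how a word can map to several separate byte ranges in raw HTML. The function name word_byte_spans and its rules are my own assumptions: any tag is skipped without ending a word, only ASCII whitespace ends a word, and entities, comments, and ">" inside attribute values are ignored.

```python
def word_byte_spans(raw: bytes) -> list[list[tuple[int, int]]]:
    """Return, for each word, the (start, end) byte ranges that hold its text.

    Hypothetical illustration: tags never end a word, whitespace does.
    Ranges are half-open, so raw[start:end] yields the bytes.
    """
    words = []        # finished words, each a list of byte ranges
    current = []      # byte ranges of the word being built
    run_start = None  # start offset of the open run of text bytes
    in_tag = False

    for i, b in enumerate(raw):
        ch = bytes([b])
        if in_tag:
            # Skip everything up to the end of the tag.
            if ch == b">":
                in_tag = False
            continue

        if ch != b"<" and not ch.isspace():
            # Ordinary text byte: open a run if none is open.
            if run_start is None:
                run_start = i
            continue

        # A tag start or whitespace closes the open run of text bytes.
        if run_start is not None:
            current.append((run_start, i))
            run_start = None

        if ch == b"<":
            in_tag = True          # markup interrupts the word but does not end it
        elif current:
            words.append(current)  # whitespace ends the word
            current = []

    # Flush whatever is still open at the end of the input.
    if run_start is not None:
        current.append((run_start, len(raw)))
    if current:
        words.append(current)
    return words


raw = b"<p>Hel<b>lo</b> world</p>"
spans = word_byte_spans(raw)
first_word = b"".join(raw[s:e] for s, e in spans[0])
```

Here the first word is split across two byte ranges because of the bold tag in the middle of it, which is the kind of case the table has to encode.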