09-19-2015, 04:58 PM   #423
adamselene
Device: Kindle Paperwhite, Kindle Touch, Kindle 2
I did some research today on ligatures with KF8 and older Kindles.

There is specific code in kindlegen to expand precomposed ligatures back to their constituent letters. This is sort of obnoxious, but is probably because older Kindles (note: I only tested on a Touch, but I assume it's true of all older devices) render them horribly: they appear to fall back to a font that's completely different from Caecilia.
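To make "expand back to constituent letters" concrete, here is a rough Python sketch of the same kind of expansion. It is not kindlegen's code (kindlegen uses its own table); it leans on Unicode compatibility decomposition from the standard library, and the function name `expand_ligatures` is made up for illustration.

```python
import unicodedata

def expand_ligatures(text):
    """Expand U+FB00..U+FB04 to plain letters; leave everything else alone.

    Assumption: Unicode compatibility decomposition (NFKD) gives the same
    letters as kindlegen's own table for these five code points.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch) if "\ufb00" <= ch <= "\ufb04" else ch
        for ch in text
    )

expanded = expand_ligatures("o\ufb03ce")  # "office" spelled with the ffi ligature
```

Restricting the range matters: running NFKD over the whole text would also split accented letters into base letter plus combining mark.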

However, if you want to use precomposed ligatures (Unicode code points U+FB00 through U+FB04: ff, fi, fl, ffi, ffl) in your KF8 files, there are two options:

1) Use Calibre. It has a checkbox to turn off ligature expansion, and Just Works. (However, I didn't like using Calibre, because it doesn't use HUFF/CDIC compression, and ignores the Adobe page-map, making its own instead.)

2) Binary patch kindlegen. Ultimately, it just has a simple table of ligatures. E.g., in 0208-797bf75, it starts here:

0x88875c0: 0x00000132 0x08887e48 0x00000002 0x00000133
0x88875d0: 0x088d5548 0x00000002 0x000001c7 0x08887e4b
0x88875e0: 0x00000002 0x000001c8 0x08887e4e 0x00000002
0x88875f0: 0x000001c9 0x08887e51 0x00000002 0x000001ca

The format is simple: triplets of 32-bit words holding a UTF-16 code unit, a pointer to the replacement string, and that string's length. The interesting part, the standard ligatures, is here:

0x8887698: 0x0000fb00 0x0888f0ba 0x00000002 0x0000fb01
0x88876a8: 0x0888dc45 0x00000002 0x0000fb02 0x08887e6e
0x88876b8: 0x00000002 0x0000fb03 0x08887e69 0x00000003
0x88876c8: 0x0000fb04 0x08887e6d 0x00000003
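For anyone who wants to read the dump mechanically, here is a small Python sketch that splits the words above into (code point, pointer, length) triplets. The dump text is copied from this post; the names `DUMP` and `parse_triplets` are just mine.

```python
# Words copied verbatim from the dump above.
DUMP = """\
0x8887698: 0x0000fb00 0x0888f0ba 0x00000002 0x0000fb01
0x88876a8: 0x0888dc45 0x00000002 0x0000fb02 0x08887e6e
0x88876b8: 0x00000002 0x0000fb03 0x08887e69 0x00000003
0x88876c8: 0x0000fb04 0x08887e6d 0x00000003
"""

def parse_triplets(dump):
    """Return a list of (code_point, string_pointer, string_length) tuples."""
    words = []
    for line in dump.splitlines():
        if not line.strip():
            continue
        # Drop the leading "address:" column, keep the data words.
        _, _, rest = line.partition(":")
        words.extend(int(word, 16) for word in rest.split())
    if len(words) % 3:
        raise ValueError("dump does not contain whole triplets")
    return [tuple(words[i:i + 3]) for i in range(0, len(words), 3)]

TRIPLETS = parse_triplets(DUMP)
for code_point, pointer, length in TRIPLETS:
    print(f"U+{code_point:04X}: {length} letters at 0x{pointer:08x}")
```

The lengths come out as 2, 2, 2, 3, 3, which matches ff, fi, fl, ffi, ffl. The pointer for U+FB02 sits one byte after the pointer for U+FB04, which is consistent with "fl" reusing the tail of the "ffl" string.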

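Replace the code point in each of these five entries with something innocuous that won't appear in your text (I used 0xebcd, which falls in the Unicode Private Use Area), and voilà, kindlegen now outputs the ligatures untouched. You'll also now see this in the informational output: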

Info(prcgen):I1046: Found UNICODE range: Alphabetic Presentation Forms [FB00..FB4F]
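If you'd rather script the patch than poke at it in a hex editor, something along these lines might work. Treat it as a sketch under assumptions: that the table is stored in the file as little-endian 32-bit words, that the five entries sit back to back at a 12-byte stride as in the dump, and that the first matching run is the right one. The function name `patch_ligature_entries` is hypothetical, and the addresses in the dump are memory addresses, not file offsets, which is why this searches by pattern instead.

```python
import struct

LIGATURES = (0xFB00, 0xFB01, 0xFB02, 0xFB03, 0xFB04)
ENTRY_SIZE = 12  # three 32-bit words: code point, pointer, length

def patch_ligature_entries(image, replacement=0xEBCD):
    """Return a copy of image with the five ligature code points overwritten.

    Assumes little-endian words and contiguous entries (see note above).
    Pointers and lengths are left untouched.
    """
    patched = bytearray(image)
    needle = struct.pack("<I", LIGATURES[0])
    start = image.find(needle)
    while start != -1:
        fits = start + ENTRY_SIZE * len(LIGATURES) <= len(image)
        if fits and all(
            struct.unpack_from("<I", image, start + ENTRY_SIZE * i)[0] == cp
            for i, cp in enumerate(LIGATURES)
        ):
            for i in range(len(LIGATURES)):
                struct.pack_into("<I", patched, start + ENTRY_SIZE * i, replacement)
            return bytes(patched)
        start = image.find(needle, start + 1)
    raise ValueError("ligature table not found")
```

Work on a copy of the binary, and check that the "Found UNICODE range" line above shows up afterwards.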

This is really only good for Paperwhites (back to 1st gen) and Voyage, but hey, that works for me!

Addendum: The precomposed ligatures also display fine in current K4PC (both Georgia and Bookerly). It's worth noting that searching for a word with one of these characters in K4PC is wonky (you have to enter the ligature character itself), but it seems to work fine on the hardware.

Further addendum: In case it isn't obvious, most EPUB books do not contain precomposed ligatures, instead relying on the rendering system and the font to build ligatures automatically. You will have to edit the ebook contents to use them. This seems to follow Amazon's general trend of not using any of the bytecode stuff in OpenType.
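As a starting point for that editing step, here is a minimal Python sketch that swaps letter sequences for the precomposed characters. It only makes sense on plain text you have already pulled out of the markup; run it over raw XHTML and it will happily mangle tag names and attribute values. The names `LIGATURE_MAP` and `to_precomposed` are mine.

```python
# Longest sequences first, so "ffi" is not consumed as "ff" + "i".
LIGATURE_MAP = [
    ("ffi", "\ufb03"),
    ("ffl", "\ufb04"),
    ("ff", "\ufb00"),
    ("fi", "\ufb01"),
    ("fl", "\ufb02"),
]

def to_precomposed(text):
    """Replace ff/fi/fl/ffi/ffl letter runs with U+FB00..U+FB04."""
    for letters, ligature in LIGATURE_MAP:
        text = text.replace(letters, ligature)
    return text

sample = to_precomposed("office waffle fly off")
```

This is a blind substitution. It does not know about words where a ligature is typographically wrong, so you would still want to proofread the result.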

Last edited by adamselene; 09-19-2015 at 05:45 PM.