Yes, SubsetFonts is the other plugin I was thinking of, but I didn't have the exact name off the top of my head.
Sure, any automatic process (or not subsetting at all) is easier than using FontForge. But as I see it, the entire point of subsetting a font is to include only those characters that the ebook actually uses from that font.
For example, I don't use an embedded font for the body text, only for headings and (usually) the first 3 words of each chapter. The body of my book includes the word "café" but the headings and first 3 words do not. When I ran either SubsetFonts or ePUBOptimizer, the resulting subsetted version of the heading font included the character é, which will never get used, along with a few other extraneous characters. (I do appreciate that the ligatures were subsetted by ePUBOptimizer.)
ePUBOptimizer also included & and # in the subset, even though they don't appear in the body text at all, only in HTML entities.
The version of the manual process I used this time was:
1. Use Calibre to convert the ePub to HTMLZ
2. Unzip the file
3. Open the HTML in Word
4. Use Find Format to delete any text in the default body text font
5. Copy everything that's left into BBEdit
6. Add a line break after each character
7. Process duplicate lines, case sensitive, leaving one
8. Sort lines
9. Remove line breaks
10. Use the resulting list: (space),-.0123456789ABCDEFGHIJLMNOPRSTUVWYabcdefghiklmnoprstuvwxyz®’“”
to determine which characters to keep in FontForge, plus (CR)fifl (which do get used in this book by readers that do ligatures automatically).
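Steps 6 through 9 (one character per line, remove duplicates, sort, rejoin) boil down to taking the set of characters and sorting it. Here is a rough Python sketch of just that part. The function name `unique_chars` is made up for illustration, and it assumes the text in the embedded font has already been isolated, as in steps 1 through 5. It drops line breaks from the result, so (CR) would still need to be added back by hand as above.

```python
def unique_chars(text):
    """Return the distinct characters in text, sorted by code point."""
    # A set removes duplicates case-sensitively, so "A" and "a" are both kept.
    # Line breaks are dropped here (an assumption of this sketch).
    return "".join(sorted(set(text) - {"\n", "\r"}))


if __name__ == "__main__":
    # Made-up sample standing in for the text left over after step 5.
    heading_text = "Chapter 1\nThe Long Road"
    print(unique_chars(heading_text))
```

The sort is by Unicode code point, which puts the space first, then punctuation and digits, then capitals, then lowercase, then characters such as ® and the curly quotes, the same order as in the list above.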
It would be possible to do all of that in a plugin—take each @font-face from the CSS, see which tags use that font-face, process only the text included in those tags in the ePub to find the unique characters. It would also be possible to do the ligature selection algorithmically, before the unique characters are found.
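To make the idea concrete, here is a minimal sketch of the text-gathering half, using only the Python standard library. It is not how SubsetFonts or ePUBOptimizer work. Everything named here (`FontTextCollector`, `chars_for_font`, the `LIGATURES` table) is hypothetical. It also skips the CSS step: it assumes you already know which tag names and class names use the font, and that the content is well-formed XHTML where every element is closed. Because the parser decodes entities before the characters are counted, the & and # from something like `&#8220;` never make it into the list.

```python
from html.parser import HTMLParser

# Hypothetical ligature table: letter sequence -> precomposed ligature character.
LIGATURES = {"fi": "\ufb01", "fl": "\ufb02"}


class FontTextCollector(HTMLParser):
    """Collect text inside elements matching the given tag names or classes."""

    def __init__(self, tags=(), classes=()):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.tags = set(tags)
        self.classes = set(classes)
        self.stack = []   # one bool per open element: is it inside a match?
        self.chunks = []  # pieces of text that use the font

    def handle_starttag(self, tag, attrs):
        element_classes = set((dict(attrs).get("class") or "").split())
        matched = tag in self.tags or bool(element_classes & self.classes)
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(matched or inherited)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1]:
            self.chunks.append(data)


def chars_for_font(xhtml, tags=(), classes=()):
    """Return the sorted unique characters used by the matching elements."""
    collector = FontTextCollector(tags, classes)
    collector.feed(xhtml)
    collector.close()
    text = "".join(collector.chunks)
    chars = set(text) - {"\n", "\r"}
    # Ligatures are checked on the full text, before it is reduced to a set.
    for sequence, glyph in LIGATURES.items():
        if sequence in text:
            chars.add(glyph)
    return "".join(sorted(chars))


if __name__ == "__main__":
    sample = ('<body><h1>&#8220;The first flight&#8221;</h1>'
              '<p>caf\u00e9 <span class="lead">In the beginning</span> was</p></body>')
    print(chars_for_font(sample, tags={"h1"}, classes={"lead"}))
```

A real plugin would still have to work out the tag and class lists from the @font-face rules and the selectors that reference them, and would have to check that the font actually contains the ligature glyphs before keeping them.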
I've kludged some things together in Python in the past, so maybe I could take a stab at adjusting SubsetFonts before the "# get rid of the tags" section, but someone who actually knows what they're doing could do it better and in a fraction of the time. I like a challenge, so maybe I'll take some Saturdays and try it anyway. I've never touched C#, so trying to alter ePUBOptimizer is beyond me.