Old 04-25-2010, 04:23 PM   #1
Cygfrydd
Jackass Crazyfish
Posts: 7
Karma: 10
Join Date: Jan 2007
Location: St. Petersburg, Florida, US
Device: Sony Reader PRS-505
ePub Font Subsetting

I'm working on an entirely Python-based ePub build toolchain (I use Subversion for source management), and I have it working quite nicely, including font embedding and obfuscation. However, the resulting ePubs are bloated, since the fonts I'm using have fairly extensive collections of glyphs, so I need to implement some sort of subsetting.

This turned out to be far more complicated than I initially realised. epub-tools has been mentioned several times as supporting both obfuscation and subsetting; however, it's implemented in Java, and it doesn't appear to be able to take an already-compiled ePub and modify it. Subsetting requires, it seems, two rather complex tasks:

1) Working out which glyphs are used: parsing the content of the ePub's component files for all elements that aren't set to display: none (and possibly alt-text for images), parsing the embedded and inline styles to generate a computed style for each element, resolving that computed style to an embedded font, and then collecting the glyphs used from that font.

2) Subsetting the font[s] accordingly, which, as I've discovered, isn't as simple as deleting every glyph that isn't needed (besides .notdef); apparently just modifying the TrueType 'glyf' table is insufficient.
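As a rough illustration of the simplest part of task 1, here is a minimal sketch using only the Python standard library. It collects the characters that appear in visible text, skipping elements hidden with an inline display: none and anything inside head, script, or style. The names GlyphCollector and collect_chars are made up for this example. It deliberately does not resolve stylesheets or map text to fonts, which is the hard part described above, so treat it as a starting point rather than a solution.

```python
from html.parser import HTMLParser

# Elements that never have an end tag or text content of their own.
VOID = {"br", "hr", "img", "input", "link", "meta"}
# Elements whose text is never rendered in the page body.
SKIP = {"head", "script", "style"}


class GlyphCollector(HTMLParser):
    """Collect characters from text that is not obviously hidden."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chars = set()
        self._stack = []  # one flag per open element: True if hidden

    def _currently_hidden(self):
        return bool(self._stack) and self._stack[-1]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not self._currently_hidden():
            # Alt text may be rendered if the image cannot be shown.
            self.chars.update(attrs.get("alt") or "")
        if tag in VOID:
            return
        style = (attrs.get("style") or "").replace(" ", "").lower()
        hidden = tag in SKIP or "display:none" in style
        # Hidden state is inherited from the enclosing element.
        self._stack.append(self._currently_hidden() or hidden)

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._currently_hidden():
            return
        # Keep the space character, drop newlines and tabs.
        self.chars.update(c for c in data if c == " " or not c.isspace())


def collect_chars(xhtml):
    """Return the set of characters used in visible text of one document."""
    parser = GlyphCollector()
    parser.feed(xhtml)
    parser.close()
    return parser.chars
```

Calling collect_chars() on each content document and taking the union of the results gives one character set for the whole book. Splitting that set per embedded font would still need the computed-style step.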

I have an extremely ugly solution partially working: I use the Java tool css2xslfo to convert my content into XSL-FO, parse the results to get font information and glyph coverage (drastically easier than trying to parse XHTML+CSS and work out computed styles), and then pass the list of glyphs to a Perl tool, font-optimizer, which does the actual subsetting.

This is ugly, and certainly doesn't meet my goal of doing everything in Python.

Does anyone have any suggestions? I can probably manage to cobble together workable font subsetting using fonttools, which has a truly lovely roundtripping TTF-to-XML conversion, but the actual parsing of XHTML and its associated stylesheets seems to be beyond me (though I find it difficult to believe that someone hasn't already implemented this, beyond the basic stuff that cssutils does).
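For the subsetting half, one possible route is the pyftsubset command-line tool that ships with more recent fonttools releases (an assumption: it is not mentioned in this thread, and it has to be installed separately). The sketch below only turns a set of characters, however it was collected, into the hexadecimal ranges that pyftsubset's --unicodes option accepts and builds the argument list. The helper names unicode_ranges and pyftsubset_args are invented for this example, and nothing is executed.

```python
def unicode_ranges(chars):
    """Collapse characters into comma-separated hex codepoint ranges."""
    ranges = []
    for cp in sorted({ord(c) for c in chars}):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1][1] = cp          # extend the current run
        else:
            ranges.append([cp, cp])     # start a new run
    return ",".join(
        f"{lo:04X}" if lo == hi else f"{lo:04X}-{hi:04X}"
        for lo, hi in ranges
    )


def pyftsubset_args(font_path, out_path, chars):
    """Build (but do not run) a pyftsubset command line for one font."""
    return [
        "pyftsubset",
        font_path,
        f"--unicodes={unicode_ranges(chars)}",
        f"--output-file={out_path}",
    ]
```

The resulting list could be handed to subprocess.run() once the paths point at real files. Whether the subset font then behaves correctly on a given reader is something to test, not something this sketch guarantees.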

So... anyone have any ideas?

—Cyg
Old 04-25-2010, 05:08 PM   #2
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
calibre resolves all CSS into simple classes of computed values as part of the conversion pipeline. This is then used for things like font size rescaling. Finding embedded fonts for subsetting should be trivial.
Old 08-17-2010, 08:53 AM   #3
billingd
Enthusiast
Posts: 42
Karma: 8616
Join Date: May 2010
Location: Melbourne, Australia
Device: Kobo
sorry for the noise.

Last edited by billingd; 08-17-2010 at 08:55 AM. Reason: deleting irrelevant post
Tags
epub font subset python

