Old 10-10-2012, 09:22 AM   #16
pdurrant
The Grand Mouse
Posts: 32,912
Karma: 89897838
Join Date: Jul 2007
Location: Norfolk, England
Device: NOOK ST GlowLight
I've found a Perl-based TTF font subsetter here: https://bitbucket.org/philip/font-optimizer/overview (MIT license)

There seems to be some C++ source here: http://podofo.sourceforge.net/doc/ht...TTFSubset.html (LGPL)

Another Perl one here: http://search.cpan.org/~mhosken/Font-TTF-Scripts/ (Perl Artistic license 2.0)

A PHP one here: http://www.4real.gr/technical-documents-ttf-subset.html

Aha! A Python one under active development as part of a PDF generation project: http://code.google.com/p/pyfpdf/ (LGPL)

I do not know whether any of these projects also work on OTF fonts.

matplotlib is another Python project that seems to include font subsetting: https://github.com/matplotlib/matplotlib (an attribution licence)
Old 10-10-2012, 09:24 AM   #17
pdurrant
The Grand Mouse
Quote:
Originally Posted by Freeshadow View Post
Fontforge can AFAIR be controlled by python scripts.
But it's a bit of overkill. I'd much rather find a self-contained Python library. I've done a bit of digging and listed some projects that I've found, as a reminder to myself to check them out later.
Old 10-10-2012, 09:35 AM   #18
Jellby
frumious Bandersnatch
Posts: 6,259
Karma: 4801167
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I wonder how those subsetters work with ligatures and glyph variants...
Old 10-10-2012, 09:42 AM   #19
pdurrant
The Grand Mouse
Quote:
Originally Posted by Jellby View Post
I wonder how those subsetters work with ligatures and glyph variants...
Those are good questions!
Old 10-10-2012, 02:21 PM   #20
JSWolf
Resident Curmudgeon
Posts: 37,904
Karma: 18763702
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by Freeshadow View Post
Python means FontForge could be fed with it, which is just what I thought about.
Not a good solution as Fontforge is Linux and to get it to run on Windows, Cygwin has to be installed and Python has to be installed under Cygwin. That's a very poor solution for Windows users. We need a more universal solution.
Old 10-10-2012, 03:17 PM   #21
Doitsu
Wizard
Posts: 2,052
Karma: 4836606
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by JSWolf View Post
Not a good solution as Fontforge is Linux and to get it to run on Windows, Cygwin has to be installed and Python has to be installed under Cygwin.
You're wrong on both counts. The Fontforge Windows build no longer requires Cygwin and Python never did.
You may want to do your own research every now and then.
Old 10-10-2012, 03:39 PM   #22
pdurrant
The Grand Mouse
Working on a way to subset fonts for ePub/KF8

In a couple of other threads, the idea has come up that it would be very useful to have a tool that could subset the fonts embedded in ePub and/or KF8 books, as that would both reduce the file sizes and make more fonts available, as some font licences require subsetting to permit embedding.

I have moved the posts from the main other thread over here.
Old 10-10-2012, 03:49 PM   #23
pdurrant
The Grand Mouse
Someone raised the question of ligatures and alternate characters. As I said, these are very good questions, as the creator of the ePub has no control over the glyph choice of the display software.

In PDFs, the subsetting task is a lot simpler, as all the glyphs used (not just characters) are fixed in the PDF.

For ePubs and KF8, I think we must take this into account in any solution. But this doesn't need to be part of the font subsetting code, which should work from a passed list of glyphs that should be included. (And should return an error if any are missing from the font.)
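
To make that interface concrete, here is a minimal Python sketch of the check described above: the caller passes a list of glyph names, and anything not present in the font is reported as an error. The names `check_glyph_list` and `MissingGlyphsError` are made up for illustration, and the font's glyph names are passed in as a plain list; reading them from a real TTF/OTF file is left out.

```python
class MissingGlyphsError(ValueError):
    """Raised when requested glyphs are absent from the font."""

    def __init__(self, missing):
        self.missing = sorted(missing)
        super().__init__("glyphs not in font: " + ", ".join(self.missing))


def check_glyph_list(font_glyphs, wanted):
    """Return the wanted glyph names in font order, or raise if any are missing.

    font_glyphs: iterable of glyph names the font contains (hypothetical
                 input; a real subsetter would read these from the font).
    wanted:      iterable of glyph names the caller wants to keep.
    """
    font_order = list(font_glyphs)
    wanted = set(wanted)
    missing = wanted - set(font_order)
    if missing:
        raise MissingGlyphsError(missing)
    # Keep the font's own glyph order in the result.
    return [name for name in font_order if name in wanted]
```

Failing loudly here means the code that scans the book, not the subsetter, decides what to do about a character the font cannot display.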

For ligatures we might need to generate not only a list of all characters in a file, but also of all character pairs. But, of course, there are also three-character ligatures (ffi in English, for example), and I suppose some languages might have more.
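
As an illustration of the pairs-and-triples idea, here is a short Python sketch that collects every run of one to three consecutive characters from a piece of text. The function name `char_sequences` and the `max_len` parameter are assumptions made for this example only.

```python
def char_sequences(text, max_len=3):
    """Return the set of all runs of 1..max_len consecutive characters."""
    found = set()
    for n in range(1, max_len + 1):
        # Slide a window of width n across the text.
        for i in range(len(text) - n + 1):
            found.add(text[i:i + n])
    return found
```

Note that this counts runs across spaces and punctuation as well, so the result can grow quickly for a whole book; raising `max_len` would cover longer ligatures at a further cost in size.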

Hmmm... Perhaps we just need to include all ligatures for which the source file includes all the characters in the ligature.
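
That rule is simple enough to sketch in a few lines of Python. The sketch assumes the ligature information is already available as a dictionary from ligature glyph name to the characters it replaces; that format, and the name `ligatures_to_keep`, are hypothetical, and extracting the data from a real font is not shown.

```python
def ligatures_to_keep(chars_used, font_ligatures):
    """Pick the ligatures whose component characters all occur in the book.

    chars_used:     set of single characters found in the text.
    font_ligatures: dict mapping a ligature glyph name to its component
                    string, e.g. {"f_f_i": "ffi"} (hypothetical format).
    """
    chars_used = set(chars_used)
    return sorted(
        name
        for name, components in font_ligatures.items()
        if set(components) <= chars_used
    )
```

This deliberately over-includes: a ligature is kept even if its characters never appear next to each other in the text, which costs a few extra glyphs but avoids having to track adjacency.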

Or perhaps we also need a script to get information on ligatures present in a font, so that that information can be used when parsing the XHTML.

Or should we start off with a very basic solution, and elaborate once that's working?
Old 10-11-2012, 05:19 AM   #24
Jellby
frumious Bandersnatch
Quote:
Originally Posted by pdurrant View Post
Hmmm... Perhaps we just need to include all ligatures for which the source file includes all the characters in the ligature.
I think that's the easiest, same with other features like kerning pairs or alternate glyphs.
Old 10-11-2012, 06:11 AM   #25
Toxaris
Wizard
Posts: 3,104
Karma: 5861069
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
According to the site, the C++ class can also work with OTF files. Unfortunately, I cannot program in C++...
Old 10-11-2012, 01:13 PM   #26
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by Doitsu View Post
You're wrong on both counts. The Fontforge Windows build no longer requires Cygwin and Python never did.
You may want to do your own research every now and then.
Are you sure Python would work with that Windows compiled version of Fontforge?
Old 10-11-2012, 03:55 PM   #27
Doitsu
Wizard
Quote:
Originally Posted by JSWolf View Post
Are you sure Python would work with that Windows compiled version of Fontforge?
Unfortunately, the MinGW-based FontForge Windows binary installer doesn't seem to install the fontforge Python module, but in theory it should be possible to use Python scripts to control FontForge.
Old 10-11-2012, 03:57 PM   #28
Toxaris
Wizard
The Windows binary of FontForge has other drawbacks too: it crashes often.
Old 10-14-2012, 04:29 PM   #29
DiapDealer
Grand Sorcerer
Posts: 9,424
Karma: 43260000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Because I have much more time than sense, I've done some more work on the script that counts/collects the characters used in files.

Building on the core that Man Eating Duck posted, this script will work for a single ePub, (X)HTML, or text file. In addition to filtering all of the HTML code/attributes from the results, it will also convert entities (named or numeric) to their rendered equivalents.

It also has the ability to limit the results to a single specified CSS class (handy for determining the font-subset required for headings or drop-caps).
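
For readers who want to see the general idea without opening the attachment, here is a minimal Python 3 sketch of this kind of character collection, using only the standard library's `html.parser`. It is not the attached script: the names `CharCollector` and `characters_used` are invented for this example, and it handles a single (X)HTML string rather than a whole ePub.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose text content is never rendered as book text.
SKIP_TAGS = {"script", "style"}


class CharCollector(HTMLParser):
    """Collect the unique characters of rendered text, optionally per CSS class."""

    def __init__(self, css_class=None):
        # convert_charrefs=True turns &eacute; / &#233; into the real character.
        super().__init__(convert_charrefs=True)
        self.css_class = css_class
        self.chars = set()
        self._stack = []  # (tag, collect_text_here) for each open element

    def _collecting(self):
        if self._stack:
            return self._stack[-1][1]
        return self.css_class is None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        wanted = self._collecting() or self.css_class in classes
        self._stack.append((tag, wanted and tag not in SKIP_TAGS))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; unmatched end tags are ignored.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        if self._collecting():
            self.chars.update(data)


def characters_used(markup, css_class=None):
    """Return the set of characters in the text of markup."""
    parser = CharCollector(css_class)
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return parser.chars
```

Calling `characters_used(markup, "dropcap")` would then return only the characters inside elements carrying that class (including their descendants), which is the kind of list a heading or drop-cap font subset needs.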

Python will almost always have trouble printing certain Unicode characters to the console on Windows, so Windows users should consider writing the results to a file and then viewing that file in an editor that supports the required character encoding.
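
Writing the list to a file sidesteps the console entirely. A minimal Python 3 sketch, assuming the characters have already been collected into a set; the function name `write_char_list` and the one-character-per-line layout are illustrative choices, not what the attached script does.

```python
def write_char_list(chars, path, encoding="utf-8"):
    """Write each character on its own line, prefixed by its code point."""
    with open(path, "w", encoding=encoding) as out:
        # Sort by code point so the file is stable between runs.
        for ch in sorted(chars):
            out.write("U+%04X\t%s\n" % (ord(ch), ch))
```

Including the code point makes invisible or look-alike characters (non-breaking spaces, curly quotes) easy to spot when the file is opened in an editor.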

Should work with Python 2.5 - 2.7 (maybe even earlier).

A test xhtml file is included for testing/benchmarking purposes.

Code:
USAGE: characters-used.py [-h] [-o OUTFILE] [-e ENCODING] [-c CSSCLASS] FILE

This script will parse an epub/html/text file and generate a list of
unique characters used in that file.

Positional arguments:
   FILE         Input file (epub html text).

Optional arguments:
   -h, --help           show this help message and exit.
   -o OUTFILE, --outfile OUTFILE
                Output file for unique character list. (default: None)
   -e ENCODING, --encoding ENCODING
                Character encoding of input file. (default: utf-8)
   -c CSSCLASS, --cssclass CSSCLASS
                Restrict results to a specific CSS class. (default: None)
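
The option list above maps naturally onto Python's standard `argparse` module. The following is only a sketch of how such an interface could be declared, not the attached script's actual code (which targets Python 2.5 - 2.7, where `argparse` is not always available); `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the same options as the usage text, with the same defaults."""
    parser = argparse.ArgumentParser(
        prog="characters-used.py",
        description="Parse an epub/html/text file and generate a list of "
                    "unique characters used in that file.",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("file", metavar="FILE",
                        help="Input file (epub html text).")
    parser.add_argument("-o", "--outfile", default=None,
                        help="Output file for unique character list.")
    parser.add_argument("-e", "--encoding", default="utf-8",
                        help="Character encoding of input file.")
    parser.add_argument("-c", "--cssclass", default=None,
                        help="Restrict results to a specific CSS class.")
    return parser
```

For example, `build_parser().parse_args(["-c", "dropcap", "book.epub"])` yields a namespace with `file`, `outfile`, `encoding` and `cssclass` attributes.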
Attached Files
File Type: zip characters-used.zip (4.5 KB, 73 views)

Last edited by DiapDealer; 10-14-2012 at 04:31 PM.
Old 10-14-2012, 06:44 PM   #30
JSWolf
Resident Curmudgeon
But is there a way of taking a TTF or OTF font file and subsetting it that works on Windows, OS X, & Linux without the need for Cygwin?
All times are GMT -4.