#1
Junior Member
Posts: 4
Karma: 10
Join Date: Jun 2010
Device: Sony PRS 505
Looking for a tool to help strip fonts of unnecessary characters
I am hand-converting a PDF book to ePub and have run into a problem. I don't want to use the PDF's fonts because they are under a proprietary license, so I'm using the closest open source equivalents I can find. The trouble is that I am importing each entire font into the ePub file, which makes it much larger than it needs to be. What I'd like to do is strip all unnecessary characters from the font so that it is as small as possible.
I know how to do this with FontForge, but I don't know how to determine exactly which Unicode characters are used in a given work. The author likes to use various characters here and there beyond the normal Esperanto ones (most of the English alphabet, plus ĉĈĝĜĥĤĵĴŝŜŭŬ). I'm worried about missing some of them; if I guessed, I would have to check the whole document by hand.
Is it possible to trick Acrobat Pro into doing it for me (by converting it to PDF and then getting Acrobat to do it)? Are there scripts for detecting which characters are used in a Unicode text file? Any ideas for solving this problem are appreciated.
#2
Zealot
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
http://scripts.sil.org/cms/scripts/p...CharacterCount
If you're handy with Perl, you might be able to use the counts to write a FontForge script that generates the stripped font automatically.
Regards, Joop
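For readers who would rather not use Perl, the same kind of character inventory can be sketched in a few lines of Python. This is not the SIL script linked above, only a rough equivalent under some assumptions: the book has already been saved as a UTF-8 text file, and the function names and the `book.txt` file name are made up for illustration.

```python
import sys
import unicodedata
from collections import Counter


def char_inventory(text):
    """Count every character in text, ignoring control characters such as newlines."""
    return Counter(ch for ch in text if unicodedata.category(ch) != "Cc")


def format_inventory(counts):
    """Return one line per character: code point, character, count, Unicode name."""
    lines = []
    for ch in sorted(counts):
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X}\t{ch}\t{counts[ch]}\t{name}")
    return lines


if __name__ == "__main__":
    # Hypothetical usage: python char_inventory.py book.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as f:
            print("\n".join(format_inventory(char_inventory(f.read()))))
```

The space character is kept on purpose, since the stripped font still needs a space glyph. If the text stores accented letters as a base letter plus a combining mark, the two will be listed separately, so it may be worth checking the output for combining characters.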
#3
Junior Member
Posts: 4
Karma: 10
Join Date: Jun 2010
Device: Sony PRS 505
Yes, thank you, this script helped a lot. A script to process the font automatically might be nice, but FontForge already lets you select glyphs by Unicode number, so I've decided to do this one by hand. If I end up having to do this often, I'll write a script and share it.
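When selecting glyphs by hand, it can help to collapse the list of used characters into runs of consecutive code points, so there are fewer entries to work through. The sketch below is only an illustration; the function names are hypothetical, and the output format is a plain `U+XXXX` listing for reading, not a syntax that FontForge is claimed to accept.

```python
def codepoint_ranges(chars):
    """Collapse characters into sorted, inclusive (start, end) runs of code points."""
    points = sorted({ord(ch) for ch in chars})
    ranges = []
    for cp in points:
        if ranges and cp == ranges[-1][1] + 1:
            # Extend the current run by one code point.
            ranges[-1] = (ranges[-1][0], cp)
        else:
            # Start a new run.
            ranges.append((cp, cp))
    return ranges


def format_ranges(ranges):
    """Render runs as 'U+0061-U+0063, U+0065' style text."""
    return ", ".join(
        f"U+{a:04X}" if a == b else f"U+{a:04X}-U+{b:04X}" for a, b in ranges
    )


print(format_ranges(codepoint_ranges("abce")))  # U+0061-U+0063, U+0065
```

Fed the twelve Esperanto letters ĉĈĝĜĥĤĵĴŝŜŭŬ, this gives six two-character runs, starting with U+0108-U+0109.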
Keeping a record for myself and for those who come after me doing the same thing: I downloaded just about every ePub reader for the Mac, trying to cut and paste the entire book so that I could feed it into the above script. Only Stanza was able to do this adequately. ADE, Calibre, and FBReader were all completely unsuitable for the task. Various word processors also failed to preserve the Esperanto characters. The Mac Terminal with vi worked fine, though.
Last edited by eriĉjo; 06-28-2010 at 09:15 PM.
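An alternative to copying the text out of a reader application is to read it straight from the ePub, which is a ZIP archive of (X)HTML files. The following is a minimal sketch under assumptions: the content files end in `.xhtml`, `.html`, or `.htm`, they are encoded as UTF-8, and the names `epub_characters` and `book.epub` are made up for illustration.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect text content, skipping the contents of <style> and <script>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def epub_characters(path):
    """Return the set of characters found in the text of an ePub's content files."""
    chars = set()
    with zipfile.ZipFile(path) as book:
        for name in book.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextCollector()
                # Assumption: content documents are UTF-8 encoded.
                parser.feed(book.read(name).decode("utf-8"))
                parser.close()
                chars.update("".join(parser.chunks))
    return chars


# Hypothetical usage:
# print("".join(sorted(epub_characters("book.epub"))))
```

Character references such as `&amp;` are decoded before counting. Text in `<title>` elements is included as well, so the result may contain a few characters that are never rendered in the embedded font.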
#4
Junior Member
Posts: 4
Karma: 10
Join Date: Jun 2010
Device: Sony PRS 505
It's done! For anyone who likes short stories in Esperanto which are readable on a Sony in the ePub format (no doubt there are millions of you out there), here you go:
http://timwestover.com/marvirinstrato/?page_id=7
And thank you, JvdW, for your help.
Last edited by eriĉjo; 06-30-2010 at 12:30 AM.
#5
Zealot
Posts: 115
Karma: 150
Join Date: Jul 2008
Location: Netherlands Veenendaal
Device: Palm T5, Sony PRS-505, Nook Color
You're welcome. Part of the credit goes to Google, which helped me find it. I have to admit I didn't come up with that link the first time around, but being a bit creative with the search terms and reading between the lines got me there.
The ePub looks nice in ADE, but in Sigil (0.23) it looks like it's using a bitmap font instead of the included TrueType fonts.
Regards, Joop