Old 01-23-2013, 07:59 PM   #16
Hitch
Bookmaker & Cat Slave
Posts: 2,371
Karma: 12862193
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Tox:

Stupid question: NEVER MIND. DUH.

H

Last edited by Hitch; 01-23-2013 at 08:00 PM. Reason: Really, really stupid.
Old 01-24-2013, 02:29 AM   #17
Toxaris
Wizard
Posts: 2,955
Karma: 3363559
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
Now I am curious what the question was...
Old 01-24-2013, 03:10 AM   #18
Hitch
Bookmaker & Cat Slave
Posts: 2,371
Karma: 12862193
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by Toxaris View Post
Now I am curious what the question was...
In short, it was: for those of us who won't know the final character set until after the ePUB is basically created, and have 20-30 XHTML files... is there an easier way to obtain all the text than copying and pasting each of the XHTML files into the box? And is there any way that you can see to incorporate this with, say, ePUBtweak.exe? So that Font Shrinker could scour the exploded files when you have ePUBtweak open, and obtain the character sets that way?

H.
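
For what it's worth, the "obtain all the text without copy-and-pasting" part can be sketched with nothing but the Python standard library. This is only an illustration under some assumptions, not how Font Shrinker or ePUBtweak work: the names `TextCollector` and `characters_in_epub` are made up here, the content files are assumed to be UTF-8 and to end in .xhtml/.html/.htm, and text inside style and script elements is skipped.

```python
import zipfile
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gathers the characters of all text nodes, ignoring <style> and <script>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # entities arrive already decoded
        self.chars = set()
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chars.update(data)


def characters_in_epub(epub_file):
    """Return the set of characters used in every (X)HTML file of an ePUB.

    epub_file may be a path or a binary file object; an ePUB is a ZIP archive.
    """
    chars = set()
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = TextCollector()
                parser.feed(archive.read(name).decode("utf-8"))
                parser.close()
                chars |= parser.chars
    return chars
```

Something like `"".join(sorted(characters_in_epub("book.epub")))` would then give one string that could be pasted into the box in a single go (the file name is just a placeholder).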
Old 01-24-2013, 04:06 AM   #19
Toxaris
Wizard
Posts: 2,955
Karma: 3363559
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
For now, the only method is copy/pasting. I know that is not always handy, but it was the easiest to do and I needed the program right away. I see the added value in having it read ePUB and/or XHTML, but that will be some work. The main problem would be identifying only the characters required by a class.
I will take a look at ePUBtweak to see if I can use the output from it. It might be a good idea.
Old 01-24-2013, 04:16 AM   #20
meme
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
The next version of Sigil will have a report listing all the characters visible in Book View. It's not by class though, so limiting it to sections of text would still require you to do some work. It may need to be changed to use Code View: this might allow seeing what is in a class, but it would be guessing at what is actually visible in Book View (e.g. if a style hides the text using display:none or similar).
Old 01-24-2013, 05:05 AM   #21
Toxaris
Wizard
Posts: 2,955
Karma: 3363559
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
I have thought about this for a while. I think I will start working on the following over the weekend (depends on a lot of personal stuff...):
- ability to select an ePUB
- parse XHTML to find all characters in use by a certain CSS class
- open the fonts used in the ePUB and shrink each according to the characters used for that font
- replace the fonts in the ePUB with the shrunk ones.

Don't expect it to be ready soon though; it needs quite some testing, and the most difficult part will probably be parsing the stylesheet to find the classes where a font is defined/used. There might be an intermediate version where the style class names have to be entered manually.
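
The intermediate version, where the class name is entered by hand, could look roughly like the Python sketch below. It is a sketch under stated assumptions, not the planned implementation: `ClassTextCollector` and `characters_for_class` are hypothetical names, the input is assumed to be well-formed XHTML (every start tag closed or self-closed), and only the class attribute is matched. Stylesheet selectors and font-family inheritance, the hard part mentioned above, are not handled at all.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collects characters found inside elements that carry a given class.

    Text in descendants of a matching element counts as well.
    """

    def __init__(self, class_name):
        super().__init__(convert_charrefs=True)
        self.class_name = class_name
        self.chars = set()
        self._stack = []  # one bool per open element: is it inside the class?

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        inherited = bool(self._stack) and self._stack[-1]
        self._stack.append(inherited or self.class_name in classes)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_startendtag(self, tag, attrs):
        pass  # self-closed elements such as <br/> contain no text

    def handle_data(self, data):
        if self._stack and self._stack[-1]:
            self.chars.update(data)


def characters_for_class(xhtml, class_name):
    """Return the set of characters used under elements with class_name."""
    parser = ClassTextCollector(class_name)
    parser.feed(xhtml)
    parser.close()
    return parser.chars
```

Running it once per class name, and feeding each result to the subsetter for the font that class uses, would approximate the second and third bullets of the list.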

As a special service to JSWolf, I will automatically add the ligatures to the unique characters used.
Old 01-24-2013, 11:26 AM   #22
Hitch
Bookmaker & Cat Slave
Posts: 2,371
Karma: 12862193
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by Toxaris View Post
I have thought about this for a while. I think I will start working on the following over the weekend (depends on a lot of personal stuff...):
- ability to select an ePUB
- parse XHTML to find all characters in use by a certain CSS class
- open the fonts used in the ePUB and shrink each according to the characters used for that font
- replace the fonts in the ePUB with the shrunk ones.

Don't expect it to be ready soon though; it needs quite some testing, and the most difficult part will probably be parsing the stylesheet to find the classes where a font is defined/used. There might be an intermediate version where the style class names have to be entered manually.

As a special service to JSWolf, I will automatically add the ligatures to the unique characters used.
Well, that's a hell of a wishlist, and it would rock, but I'd be thrilled if it could simply peruse an ePUB for all the characters used in that ePUB, even if the classes are not discovered. By which I mean: let's say I have two fonts. One for the body; one for the chapter heads. By definition, the font for the body will have more characters, in all likelihood. However, I wouldn't care, at this point in time, if I had to feed the Shrinker all the chars in the ePUB, to shrink the Chapter head font.

To have it perfect, later, would be, as I said, amazing, but right this second, what I'd love is if it could just open the ePUB and say, "VOILA!" I don't even care if I have to manually replace the fonts, that's not a big deal.

Not that I'd turn DOWN Shrinker with all the extra goodies...just thinking aloud about what I, personally, need most. I realize my needs are probably different than almost everyone else's.

OH, also: a way to direct the location of the output of the created subsetted font would be super. While I'm wish-listing.

And if I didn't say it loudly enough, before: seriously, you are fabulous.

H
Old 01-24-2013, 12:31 PM   #23
Freeshadow
temp. out of service
Posts: 2,262
Karma: 12637080
Join Date: May 2010
Location: Duisburg (DE)
Device: BeBook mini
I'm really enthusiastic that this particular idea of epub tweaking found that much positive resonance - really no joking here.

Tox: while you work on manipulating the font files, you should consider auto-renaming them: both the filename and the font name stored inside the font file. AFAIR it's often required, even by the licences of free fonts, when they are changed. While it's relatively meaningless for personal use, it's crucial as soon as your tool matures into part of the toolchain used by professional producers. (And aren't better-optimized professional books a goal we all wish for?)
Old 01-24-2013, 12:47 PM   #24
JSWolf
Resident Curmudgeon
Posts: 37,016
Karma: 18129756
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by grannyGrumpy View Post
Super! Would you like to be adopted?

After hearing about the problem with ligatures from Calibre subsetted fonts, I'm curious. Has anybody checked on whether this handles ligatures ok?

Thank you Toxaris, you get ten gold stars.
I did report the problem with ligatures and that has been fixed in Calibre.
Old 01-24-2013, 12:49 PM   #25
JSWolf
Resident Curmudgeon
Posts: 37,016
Karma: 18129756
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by Toxaris View Post
As a special service to JSWolf, I will automatically add the ligatures to the unique characters used.
Thank you very much! This will help for sure.

I've figured out what would work with the current version. Take the ePub, convert it to HTMLZ and run the HTML file through the subsetter and there you go.

Last edited by JSWolf; 01-24-2013 at 12:54 PM.
Old 01-24-2013, 12:55 PM   #26
meme
Sigil developer
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
I'm not sure. Do you have an example epub (a link or just a small file is fine) that contains ligatures? The code literally just reports each unicode character that appears in the text (and if it has an entity name).
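
To make that concrete, here is a minimal Python sketch of this kind of report. It is not Sigil's code, just an illustration of the idea; the function name `character_report` is made up, and the entity names come from Python's HTML 4 entity table, which may differ from whatever table Sigil uses.

```python
import unicodedata
from html.entities import codepoint2name


def character_report(text):
    """Return one (code point, Unicode name, entity) row per distinct character."""
    rows = []
    for ch in sorted(set(text)):
        code = ord(ch)
        name = unicodedata.name(ch, "<unnamed>")  # control characters have no name
        entity = codepoint2name.get(code)  # None when there is no named entity
        rows.append((f"U+{code:04X}", name, f"&{entity};" if entity else ""))
    return rows
```

A report like this only sees the code points present in the source text, so a ligature shows up only if the text contains it explicitly, for example as U+FB01.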
Old 01-24-2013, 01:01 PM   #27
JSWolf
Resident Curmudgeon
Posts: 37,016
Karma: 18129756
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by meme View Post
I'm not sure. Do you have an example epub (a link or just a small file is fine) that contains ligatures? The code literally just reports each unicode character that appears in the text (and if it has an entity name).
The ePub does not contain the ligatures. ADE 2.0 and Calibre (and maybe other reading software) convert to using ligatures. So for example, if your text has a word such as flight, the fl will be converted to the ligature and displayed that way. Your code would have to handle fl both as separate f and l and as the ligature, to cover reading software that does and does not convert to ligatures.

Oh and would it be possible to display each character for a given font for embedded fonts?
Old 01-24-2013, 01:20 PM   #28
Jellby
frumious Bandersnatch
Posts: 6,142
Karma: 4792399
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I'll try to explain. A text typically has no explicit ligatures, it could have some, but it should not, and I've only seen some very old text files with them. What a text has is just normal unicode characters, let's say a text consists of the single word "office", that's only 5 different letters: c, e, f, i, o.

Now, a font could have ligatures defined, and a reading software may use them (although many do not, I'm afraid). Let's say that the font we are dealing with has the ligatures "fi", "ffi" and "fj" defined. Defining a ligature means that the font has a glyph (a character shape) for the combination "fi" and some instructions saying that whenever there's an f and an i in the text, they should be rendered as the "fi" ligature and not as the separate characters (ditto for "ffi" and "fj").

OK, then our text will ideally be displayed as 4 glyphs: "o", "ffi", "c", "e". There are different things a font subsetter could do:

1) Remove everything but "o", "f", "i", "c", "e", including ligatures and their definition. This is not ideal, but it's probably the simplest.

2) Same as 1, but do not remove ligatures or their definition. That's much better, but it leaves unused glyphs, such as "fi" or "fj".

3) Detect ligatures, find out that "i" and "f" are never used alone, and remove everything but "o", "ffi", "c", "e". This is not a good idea, as renderers that do not support ligatures will not be able to display "f" and "i".

4) Remove all unused single characters, and related ligatures. This would remove "fj", since "j" is not in the source text, but leave "fi" since both "f" and "i" are, although the "fi" ligature is never used (because we have "ffi" already). I think this is the perfect combination of subsetting and not too demanding.

5) Remove some or all ligatures (the glyphs), but do not remove their definitions. This is not a good idea either, and I think this was the bug in Calibre. It means a renderer supporting ligatures would believe there is a ligature to use for "ffi", but it wouldn't find it.

So, if you can, go for #4. But things may be significantly harder. A font (particularly an OTF one) may contain other alternate shapes for glyphs (final forms, swash forms, older variants, small-caps, etc.), those are currently unused by practically all renderers, but there's still hope that some day we'll be able to enjoy some more advanced typesetting options...
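
Option 4 and the "office" example can be put into a small Python sketch. It is a toy model only: a real font stores ligatures in substitution tables rather than as plain strings, and the names `shape` and `subset_plan` are made up for the illustration. Each ligature is represented here by the string of characters it replaces.

```python
def shape(text, ligatures):
    """Split text into glyphs, preferring the longest matching ligature.

    This mimics a renderer that supports ligatures; each ligature is
    written as its (non-empty) component string, e.g. "ffi".
    """
    by_length = sorted(ligatures, key=len, reverse=True)
    glyphs, i = [], 0
    while i < len(text):
        match = next((lig for lig in by_length if text.startswith(lig, i)), text[i])
        glyphs.append(match)
        i += len(match)
    return glyphs


def subset_plan(text, ligatures):
    """Option 4: keep the used single characters, plus every ligature whose
    component characters are all present in the text."""
    used = set(text)
    kept = [lig for lig in ligatures if set(lig) <= used]
    return used, kept
```

With the ligatures "fi", "ffi" and "fj", `shape("office", ...)` yields the four glyphs o, ffi, c, e, while `subset_plan("office", ...)` keeps "fi" and "ffi" and drops "fj", because "j" never occurs in the text.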
Old 01-24-2013, 01:21 PM   #29
Freeshadow
temp. out of service
Posts: 2,262
Karma: 12637080
Join Date: May 2010
Location: Duisburg (DE)
Device: BeBook mini
Just what Jellby said.
I was slower and less detailed at it.

Last edited by Freeshadow; 01-24-2013 at 01:23 PM.
Old 01-24-2013, 01:33 PM   #30
Toxaris
Wizard
Posts: 2,955
Karma: 3363559
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
For now I will probably just add the few ligature glyphs. There aren't that many, so the impact on the size is limited. I should think about the small caps, but that will be at the bottom of the list.
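
Roughly, "just adding the few ligature glyphs" could come down to something like this Python sketch. It rests on an assumption: that the font maps its ligatures to the Unicode presentation-form code points U+FB00 to U+FB06. Fonts that expose ligatures only through substitution tables would need a different approach. `LATIN_LIGATURES` and `add_ligatures` are illustrative names.

```python
# Unicode "Alphabetic Presentation Forms" for the Latin ligatures.
LATIN_LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "long s + t",
    "\ufb06": "st",
}


def add_ligatures(chars):
    """Return the used characters plus all seven ligature code points."""
    return set(chars) | set(LATIN_LIGATURES)
```

The ligatures are added unconditionally, whether or not the text contains their component letters, which costs at most seven extra glyphs per font.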

Let me first work on the list and look for pink bidets later...
All times are GMT -4.