03-26-2013, 03:52 AM   #1
gawl
Member
Posts: 20
Karma: 9018
Join Date: Mar 2013
Device: Pocketbook Touch
Support of Special Unicode Characters?

First, please accept my apologies, because this is a bit off-topic: I am not a Sigil user (and won't be, because I prefer to use XML+XSLT to create EPUBs).

But the Sigil forum was recommended to me as a first reaction to my original post, because a lot of EPUB creators are participating here.
In the (EPUB) e-books I create, I need support for ligatures (and for the opposite, the "non-joining" of glyphs). However, I am quite desperate, because my impression is that:
-- either I am doing something completely wrong, or...
-- the support of the related OpenType features is simply non-existent, when it comes to recent mobile e-book devices.

In particular:
I am wondering whether/how it is possible to have Unicode characters like ZWNJ (U+200C) correctly "displayed" by e-book devices/software, when these appear in EPUB documents. (Just for completeness: "My" EPUBs look fine when using PC software like e.g. the Calibre preview!)

I am currently doing some tests on my Pocketbook Touch and have also tried a Sony reader, and I get the impression that all EPUB software there simply ignores this character. (That means: ligatures are built automatically, just as defined in the embedded OTF font, but they are also built in places where they must not appear, even though the glyphs are separated from each other by ZWNJ.)

Anyone else here in this forum interested in "Typography and EPUB"?
Can someone...
--- confirm or deny this typography problem?
--- tell me [if my impression is correct] whether this is an unchangeable fact that I simply have to accept (for the time being), or is there anything I can do?

The only trick that comes somewhat near to the intended result is to use U+200A (Hair Space) instead, because it prevents a ligature from being built from the glyphs to the left and right of it. The downside is that a Hair Space allows a line break, which no one wants within a word, of course. I haven't found the time yet to test characters like U+2060 ("Word Joiner") or similar. But it is probably pointless anyway to check all the endless ranges of Unicode characters, because ZWNJ is the officially designated character for this purpose.
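For anyone who wants to compare the candidate characters, here is a minimal Python sketch (standard library only) that prints their official Unicode names and general categories. The helper name `describe` is just an illustration. Category `Zs` marks a space separator and `Cf` a format character; the line-breaking class is not exposed by this module, so this only hints at why the Hair Space behaves like a space.

```python
import unicodedata

# Candidate characters mentioned above: Hair Space, ZWNJ, Word Joiner.
CANDIDATES = ["\u200a", "\u200c", "\u2060"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (general category)' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"


for ch in CANDIDATES:
    print(describe(ch))
```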

Maybe some other ideas?

Thanks for your efforts!
03-26-2013, 04:17 AM   #2
Toxaris
Wizard
Posts: 3,101
Karma: 5861069
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
Almost all knowledgeable people linger in multiple groups, so you really should have posted this in the ePUB forum instead of the Sigil forum.
Now, with regard to your questions. Ligature support depends on the reader device. Older devices will not be able to handle ligatures, since the ligature characters are not part of the readers' built-in fonts. The only way to get ligatures on the old devices is to embed a font that contains them.
That being said, the readers that support ligatures also tend to create ligatures whenever the characters are next to each other. I believe the Sony does that, and the reasoning is that it gives the reader a better reading experience. I think that on some readers you are able to turn it off.

Can you give cases where a ligature should not appear? If I recall correctly, it is only for certain languages like German. I think that is where the issue lies. The readers (and their software) are generally built for the English market (I know there are localized versions, but those are just translations). I do agree that the ZWNJ should be honored, but I am not too surprised that it gets ignored... Have you tried different embedded fonts, like Charis SIL? You stand a good chance that ZWNJ is part of that font.
If the ZWNJ character is not part of the font (sounds silly since it is empty, but it needs to be defined), it will be ignored. Since on most readers the internal fonts are crippled/not complete, I would not be surprised if the ZWNJ is not part of it.
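One way to test this hypothesis is to look at the font's character map. The sketch below is only an illustration: the function names are made up, `load_cmap` assumes the third-party fontTools package is installed, and the demo map is hypothetical. A missing entry would be consistent with the explanation above, but whether a reader honors ZWNJ may also depend on its text-shaping code, not only on the font.

```python
ZWNJ = 0x200C
WORD_JOINER = 0x2060


def missing_codepoints(cmap, wanted=(ZWNJ, WORD_JOINER)):
    """Return the code points from `wanted` that the character map lacks.

    `cmap` maps Unicode code points (ints) to glyph names.
    """
    return [cp for cp in wanted if cp not in cmap]


def load_cmap(font_path):
    """Read a font's Unicode character map (assumes fontTools is installed)."""
    from fontTools.ttLib import TTFont  # third-party: pip install fonttools

    # getBestCmap() returns None when the font has no Unicode cmap subtable.
    return TTFont(font_path).getBestCmap() or {}


# Hypothetical character map of a stripped-down font, for illustration only.
demo_cmap = {0x66: "f", 0x69: "i"}
print([f"U+{cp:04X}" for cp in missing_codepoints(demo_cmap)])
```

With a real font file, `missing_codepoints(load_cmap("SomeFont.otf"))` would do the same check; the file name here is a placeholder.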

Last edited by Toxaris; 03-26-2013 at 04:20 AM.
03-26-2013, 11:20 AM   #3
theducks
Grand Sorcerer
Posts: 15,096
Karma: 5939999
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT

Moved to EPUB
03-26-2013, 01:43 PM   #4
dgatwood
Curmudgeon
Posts: 311
Karma: 1028382
Join Date: Jan 2012
Device: iPad, iPhone, Nook Simple Touch
You might try a zero-width space alongside each zero-width non-joiner. Worth a try, anyway. Both break ligatures and are treated as potential wrapping points, so I'm really not clear on the difference between the two, other than ostensibly some theoretical semantic difference between "not joining" and "unjoining". Anybody?
03-26-2013, 04:29 PM   #5
Jellby
frumious Bandersnatch
Posts: 6,259
Karma: 4801165
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Some of us are interested in "typography and EPUB", although it's usually a rather disappointing subject... Regarding the spaces, you can see here that some special ones are supported by at least some readers, and in particular ZWNJ (U+200C) worked fine in my test.
03-27-2013, 06:59 AM   #6
gawl
Member
Posts: 20
Karma: 9018
Join Date: Mar 2013
Device: Pocketbook Touch
Quote:
Originally Posted by Toxaris
Can you give cases where a ligature should not appear? If I recall correctly, it is only for certain languages like German.
Yes, this is probably a field that is interesting only for some particular languages (German, in my case).

(Just as general information for non-Germans: in good German typography, a ligature must not span the boundary between the parts of a compound word. For a word like auffinden, which is made from the two morphemes "auf"+"finden", one may join the second "f" with the "i", but one must not join the two "f"s. A similar rule might also exist for English, but because English rarely combines words into a new single word this way, it is probably not relevant there.)
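To make the rule concrete, here is a minimal Python sketch, not my actual XSLT workflow. It assumes the morphemes are already known and that the font defines ligatures for the pairs listed in `LIGATURE_PAIRS`; both the function name and that set are assumptions for illustration. It inserts ZWNJ only where a ligature would span a morpheme boundary and then prints the word with a numeric character reference, as it could appear in XHTML.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# Assumed set of letter pairs for which the font defines ligatures.
LIGATURE_PAIRS = {"ff", "fi", "fl", "ft"}


def join_compound(parts):
    """Join non-empty morphemes, adding ZWNJ where a ligature would span a boundary."""
    result = parts[0]
    for part in parts[1:]:
        # Look at the last letter before and the first letter after the boundary.
        if result[-1] + part[0] in LIGATURE_PAIRS:
            result += ZWNJ
        result += part
    return result


word = join_compound(["auf", "finden"])
# Show the invisible character as a numeric character reference.
print(word.encode("ascii", "xmlcharrefreplace").decode("ascii"))  # auf&#8204;finden
```

The "fi" inside "finden" is left alone, so that ligature can still be formed.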

@theducks: Thanks for moving to the right place!

@Jellby: Thanks for pointing me to the Test EPUB! So, obviously at least some readers are able to handle these things correctly. My own device ("Pocketbook Touch"), however, fails completely even at the "Ligatures" chapter of your Test EPUB(s). (Unfortunately this raises more questions than it answers, because this device succeeds in building ligatures for the fonts that I am using for creating my EPUBs, whereas it fails completely at building the ligatures in your Test EPUB. So there must still be some [font-internal!?] differences that I have not understood so far...)

@dgatwood: Thanks for the suggestion! The good news is: It works (at least with my EPUBs on the Pocketbook Touch), i.e. the combination U+200C U+2060 breaks ligatures (without generating unwanted spaces or line-breaks).
Concerning what the difference between these two might be, I can only guess: maybe it is related to full-text search. Maybe U+2060 breaks the word into two different parts so that the search does not find it, whereas U+200C should probably not prevent a full-text search. (But I am not sure about that, and I can even observe in a normal Mozilla Firefox on a PC that the full-text search finds neither a U+200C-separated word nor a U+2060-separated word.)
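As a rough illustration of the search problem (a sketch only, with made-up names, and not a description of how any reader or browser implements search): a plain substring search does not match a word that contains the U+200C U+2060 pair unless the invisible characters are removed first.

```python
BREAKER = "\u200c\u2060"  # ZWNJ followed by WORD JOINER

# Translation table that deletes both invisible characters.
_STRIP = str.maketrans("", "", BREAKER)


def strip_breakers(text):
    """Remove ZWNJ and WORD JOINER so that plain-text search can match."""
    return text.translate(_STRIP)


marked = "auf" + BREAKER + "finden"
print("auffinden" in marked)                  # False: invisible characters intervene
print("auffinden" in strip_breakers(marked))  # True
```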
03-27-2013, 03:41 PM   #7
Toxaris
Wizard
Posts: 3,101
Karma: 5861069
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
Like I said, the internal fonts probably don't have the ZWNJ character built in. Therefore it works for some embedded fonts.
All times are GMT -4.