Epub Revision - enhanced global language support


Nate the great
04-08-2010, 07:10 PM
Need for enhanced global language support. There is substantial interest in using EPUB in China, Japan, Korea, and other geographies; however, it is recognized that, at a minimum, the requirements for supporting the character sets, Ruby markup, and typographic rules needed by reading systems in these geographies (including but not limited to special line-breaking rules and vertical writing direction) are not adequately specified in EPUB 2.0.1.

This one is outside my experience. Does anyone have any suggestions?

mimochal
04-10-2010, 02:40 PM
Hi Nate!

What about Arabic?! We in the Arab world have plenty of ebooks in Arabic. I have a Sony Reader, and I can't wait for the Sony people or someone from MR to suggest an update. I'd love to be able to convert my Arabic books to ePub or BBeB format!

carmelra
04-11-2010, 05:02 PM
Hi,

I'm not familiar with the Far East languages and their related issues. However, at Mendele HeBooks ( http://www.mendele.co.il ), we produce Hebrew ePubs. Since ePub is based on XHTML, producing it in any required language is not a real problem: just as one can create a web page in any language, one should be able to create an ePub in that language. Arabic is very similar to Hebrew in this respect, so creating an Arabic ePub should not be a problem either.

I think the main issue here is the level of support in the different reading systems. Although many software applications and reading devices claim to "fully support" ePub, only a few of them really handle different languages properly.

The main "features" that I think deserve attention here are:
1) CSS support - this issue is complicated in itself, and I guess it might be hard to define what it means to "support" CSS. Again, maybe there is a need to define a minimal set of language-related CSS support (the "direction" property, for example).
2) Embedded and non-embedded fonts - Can a reading system properly use fonts that exist on the system when the ePub has no embedded fonts? Can it deal with embedded fonts properly?
3) Language-related HTML attributes, such as the "dir" attribute, which is critical for RTL languages.
4) The ability to render complex text properly: characters with punctuation marks, and support for other non-English characters.
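
To make points 1 and 3 concrete, here is a minimal Python sketch that builds an XHTML content document declaring right-to-left direction both ways: with the HTML dir attribute and with the CSS direction property. The function name build_rtl_chapter and the sample stylesheet are assumptions made for illustration; they do not come from the ePub specification or from any particular toolchain.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_rtl_chapter(title, paragraphs, lang="he"):
    """Return an XHTML document (as a string) marked as right-to-left.

    Hypothetical helper: direction is declared twice, once through the
    HTML dir attribute and once through the CSS direction property, so a
    reading system that honours only one mechanism still gets RTL text.
    """
    # Serialize XHTML elements without a namespace prefix.
    ET.register_namespace("", XHTML_NS)
    html = ET.Element(
        f"{{{XHTML_NS}}}html",
        {f"{{{XML_NS}}}lang": lang, "dir": "rtl"},
    )
    head = ET.SubElement(html, f"{{{XHTML_NS}}}head")
    ET.SubElement(head, f"{{{XHTML_NS}}}title").text = title
    style = ET.SubElement(head, f"{{{XHTML_NS}}}style", {"type": "text/css"})
    style.text = "body { direction: rtl; }"
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")
    for text in paragraphs:
        ET.SubElement(body, f"{{{XHTML_NS}}}p").text = text
    return ET.tostring(html, encoding="unicode")


chapter = build_rtl_chapter("\u05e4\u05e8\u05e7 1", ["\u05e9\u05dc\u05d5\u05dd"])
```

Declaring the direction in both places is a defensive choice: the findings in this thread suggest that some reading systems respect only one of the two mechanisms.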

I would like to see clearly defined levels of "language support" against which reading systems can be measured and classified.

For example, here are some of our findings about how different reading systems handle Hebrew ePubs:
Adobe Digital Editions - Doesn't handle system fonts properly. Doesn't respect RTL direction, whether set in the CSS (direction: rtl) or with the HTML dir="rtl" attribute.
Sony Reader - Doesn't handle system fonts properly. Doesn't respect RTL direction, whether set in the CSS or with the HTML dir attribute.
Calibre - Handles system fonts properly. Respects RTL direction set in the CSS and with the HTML dir attribute. Handles special chars fairly well (there are rendering issues with a few special chars).
AZARDI - Handles system fonts properly. Respects RTL direction set in the CSS and with the HTML dir attribute. Handles special chars properly.
FBReader - Handles system fonts properly. Respects RTL direction set with the HTML dir attribute. Handles special chars properly. Has some CSS support limitations.
EPUBReader - Handles system fonts properly. Respects RTL direction set in the CSS and with the HTML dir attribute. Handles special chars properly.
Stanza for iPhone - Ignores the fonts defined in the ePub file and uses the system fonts (but displays Hebrew properly). Respects RTL direction set in the CSS and with the HTML dir attribute. Handles special chars properly.
iBooks for iPad (initial tests) - Displays Hebrew properly, respecting RTL direction set in the CSS and with the HTML dir attribute. Handles special chars properly.

In this real example, ADE and Sony Reader should get a lower ePub support level than Stanza, FBReader, Calibre, AZARDI, EPUBReader and iBooks on the iPad.

Regards,

Carmel

carmelra
04-13-2010, 09:19 AM
Hi,


I made one mistake in the sample I gave, about Calibre's Hebrew support. It should actually read:

Calibre - Handles system fonts properly. Respects RTL direction set in the CSS and with the HTML dir attribute. Doesn't render some Hebrew punctuation properly; as a result, the Calibre viewer crashes, and many Hebrew books are impossible to read.

So, in this real example, ADE and Sony Reader should get the lowest ePub support level, Calibre comes next with a medium support level, and Stanza, FBReader, AZARDI, EPUBReader and iBooks on the iPad should get the highest Hebrew support level.
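
The three-level ranking above could be expressed as a small classifier. This is only a sketch: the Findings fields and the rule in support_level are assumptions chosen to reproduce the ranking in this post, not an agreed-upon standard. The values are taken from the findings reported in this thread, with None where a capability was not reported.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Findings:
    # True if RTL direction is honoured through CSS or the HTML dir attribute.
    respects_rtl: bool
    # True if Hebrew text renders properly; None if the thread does not say.
    renders_text: Optional[bool]


def support_level(f: Findings) -> str:
    """Hypothetical rule: no RTL is 'low'; RTL with rendering problems
    (or no report) is 'medium'; RTL with proper rendering is 'high'."""
    if not f.respects_rtl:
        return "low"
    if f.renders_text is not True:
        return "medium"
    return "high"


# Values as reported in this thread (Calibre uses the corrected finding).
FINDINGS = {
    "Adobe Digital Editions": Findings(False, None),
    "Sony Reader": Findings(False, None),
    "Calibre": Findings(True, False),
    "AZARDI": Findings(True, True),
    "FBReader": Findings(True, True),
    "EPUBReader": Findings(True, True),
    "Stanza for iPhone": Findings(True, True),
    "iBooks for iPad": Findings(True, True),
}

LEVELS = {name: support_level(f) for name, f in FINDINGS.items()}
```

A real classification would probably need more dimensions (embedded versus system fonts, partial CSS support), but even this coarse rule separates the three groups described above.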

DaleDe
04-15-2010, 04:17 PM
One big problem is language-specific support for hyphenation and line breaking. For example, it is OK to break a line after an em dash in English but not in Spanish.
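
As a rough illustration of what a language-dependent line-breaking rule looks like, the sketch below finds break opportunities in a string. The rule table encodes only the English/Spanish claim from this post; the function name and the simplification to spaces and em dashes are assumptions, and a real reading system would need a far more complete line-breaking algorithm.

```python
from typing import List

EM_DASH = "\u2014"

# Whether a line may be broken right after an em dash, per language.
# Only the two cases mentioned in this post; other languages default to False.
BREAK_AFTER_EM_DASH = {"en": True, "es": False}


def break_opportunities(text: str, lang: str) -> List[int]:
    """Return each index i such that a line break may occur before text[i]."""
    allow_dash = BREAK_AFTER_EM_DASH.get(lang, False)
    points = []
    for i in range(1, len(text)):
        prev = text[i - 1]
        if prev == " " and text[i] != " ":
            # A break is always allowed after a run of spaces.
            points.append(i)
        elif prev == EM_DASH and allow_dash:
            # A break after an em dash depends on the language.
            points.append(i)
    return points


sample = "uno" + EM_DASH + "dos tres"
english_breaks = break_opportunities(sample, "en")
spanish_breaks = break_opportunities(sample, "es")
```

The same text yields different break points depending on the declared language, which is why a reading system needs to know (and respect) the language of the content.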