MobileRead Forums


tomsem 08-28-2010 10:27 AM

Unicode support in K3
 
I have been checking to see exactly what's been added in terms of additional Unicode characters. The feature description merely states 'Cyrillic, Chinese, Japanese, Korean' - but I'm finding it is much more, and somewhat less, than that.

I've seen reports that at least Chinese is not complete, and is missing some common glyphs, but am not qualified to check that (or Japanese or Korean), beyond verifying that indeed a great number of glyphs are supported. But as they say, almost isn't.

As to Cyrillic, I can confirm that it supports the basic Russian alphabet, and most other alphabets based on Cyrillic, as well as most 'extended' Cyrillic. However, it is missing some glyphs needed for at least Macedonian, Bulgarian, Kildin Sami, and Yupik (reference: Cyrillic (Unicode block)). The missing glyphs are as follows:

0400 CYRILLIC CAPITAL LETTER IE WITH GRAVE
040D CYRILLIC CAPITAL LETTER I WITH GRAVE
0450 CYRILLIC SMALL LETTER IE WITH GRAVE
045D CYRILLIC SMALL LETTER I WITH GRAVE
048A-048F, 04C5-04C6, 04C9-04CF, 04D4-04D5, 04DA-04DB, 04EA-04ED, 04F6-04FF
It also does not include 'historic' Cyrillic in the range 0460-0489.
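
To check whether a particular text would hit any of these gaps before loading it on the device, something like the following rough Python sketch could be used. The ranges are copied from the list above; the function names (`is_reported_missing`, `find_reported_missing`) are made up for this example, and the result only reflects what was reported in this post, not an authoritative list of what the K3 fonts contain.

```python
import unicodedata

# Code points reported missing in the list above (inclusive ranges).
# This mirrors the post only; it is not an official glyph table.
MISSING_RANGES = [
    (0x0400, 0x0400), (0x040D, 0x040D), (0x0450, 0x0450), (0x045D, 0x045D),
    (0x048A, 0x048F), (0x04C5, 0x04C6), (0x04C9, 0x04CF), (0x04D4, 0x04D5),
    (0x04DA, 0x04DB), (0x04EA, 0x04ED), (0x04F6, 0x04FF),
    (0x0460, 0x0489),  # 'historic' Cyrillic
]


def is_reported_missing(ch):
    """True if the character falls in one of the ranges listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in MISSING_RANGES)


def find_reported_missing(text):
    """Return (hex code point, Unicode name) for each distinct flagged character."""
    found = sorted({ch for ch in text if is_reported_missing(ch)})
    return [(f"{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in found]


if __name__ == "__main__":
    # Ordinary Russian text should come back empty; the grave-accented
    # Bulgarian/Macedonian letter should be flagged.
    print(find_reported_missing("Привет"))
    print(find_reported_missing("ѝ"))
```

An empty result means the text uses none of the code points listed here; it says nothing about other scripts.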

But beyond Cyrillic, it fully supports Armenian, Thai, Lao, Myanmar, Georgian, Ethiopic, Cherokee, Canadian Syllabics, Ogham, Runic, Buhid, Khmer, Mongolian, Limbu, and Ol Chiki - as far as I can tell. (Thai probably isn't really supported, I doubt it's able to properly break lines, etc.) I'm not able to assess readability. I'm not sure if Latin and Greek were complete before, and didn't really check these much - didn't notice anything missing.

It also looks like the 3 typefaces - regular, condensed, sans serif - all look the same for the non-Latin scripts.

Another good thing is that the fonts used for the item lists, browser, etc., seem to have at least the same level of support (I suspect the browser is even more complete, but haven't checked this yet). So you can use the new characters for Title/Author etc., and I would suspect review text that uses non-Latin scripts will display properly on Kindle now, though I didn't test that.

Sadly, no Klingon.

But I'm excited to finally be able to make some Russian+English ebooks that can be read on Kindle without font hacks or settling for PDF.

drac 08-28-2010 10:55 AM

Can you do a quick check for Romanian language support on Kindle3?

Please comment on the other thread so others will benefit from your information (if you don't see square characters then everything is fine)

Thank you.

Kumabjorn 08-28-2010 10:58 AM

I am curious to know if you can do searches in non-Latin-based languages. Is there a basic IME included? Have you installed a Cyrillic-based dictionary?

gollu 08-28-2010 11:03 AM

Hmm, that post is really informative, yet I didn't understand which are the Cyrillic letters that are not supported.
What is "IE"?
This : Й й ?
What about "I"? Is it И?

To my understanding Bulgarian has fewer letters than Russian.

Slava 08-28-2010 11:17 AM

Looks like it's a variation of Й, never saw it used in Russian texts though

http://img.skitch.com/20100828-nnn48...jykw8b4ear.jpg

tomsem 08-28-2010 11:33 AM

Quote:

Originally Posted by drac (Post 1080274)
Can you do a quick check for Romanian language support on Kindle3?

Please comment on the other thread so others will benefit from your information (if you don't see square characters then everything is fine)

Thank you.

Yes, it looks like the Romanian characters are supported. I have posted as much on the other thread.

tomsem 08-28-2010 11:36 AM

Quote:

Originally Posted by Kumabjorn (Post 1080279)
I am curious to know if you can do searches in non-Latin-based languages. Is there a basic IME included? Have you installed a Cyrillic-based dictionary?

There's no way to enter non-Latin text for search, so if the index is smart, it will ignore non-Latin scripts.

Dictionary would be possible (in which case index SHOULD index everything), but I don't have one and they are not easy to make. Might be interesting to make a little one just to see what happens.

tomsem 08-28-2010 11:38 AM

Quote:

Originally Posted by gollu (Post 1080284)
Hmm, that post is really informative, yet I didn't understand which are the Cyrillic letters that are not supported.
What is "IE"?
This : Й й ?
What about "I"? Is it И?

To my understanding Bulgarian has fewer letters than Russian.

Check the Wikipedia reference I gave above, which says they're used in the Bulgarian alphabet (how commonly I would not know). I assume 'IE' and 'I' are phonetic transliterations (Unicode descriptions like these are always given in English).
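
A quick way to see which letters the names refer to is to ask Python's standard `unicodedata` module. This is only an illustrative sketch; the helper name `describe` is invented for the example. It shows each of the four code points together with the base letter it decomposes to, which suggests 'IE' is Е and 'I' is И, each with a grave accent, while Й is a separate letter with its own name (SHORT I).

```python
import unicodedata


def describe(cp):
    """Return (character, Unicode name, base letter after canonical decomposition)."""
    ch = chr(cp)
    # NFD splits a precomposed letter into base letter + combining mark(s);
    # the first code point of the result is the base letter.
    base = unicodedata.normalize("NFD", ch)[0]
    return ch, unicodedata.name(ch), base


if __name__ == "__main__":
    for cp in (0x0400, 0x040D, 0x0450, 0x045D):
        ch, name, base = describe(cp)
        print(f"U+{cp:04X}  {ch}  {name}  (base letter: {base})")

    # For comparison, the letter asked about in the quoted post:
    print("й", unicodedata.name("й"))
```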

gollu 08-28-2010 11:45 AM

I see. ѝ is rarely used and I've even seen published books using the regular и instead of ѝ.
Definitely not a deal breaker, yet good to know the Cyrillic ain't perfect.
Thanks for the heads-up.

ppw 08-28-2010 04:49 PM

Chinese is unreadable. It is full of "squares". I tried Mobi, PRC, TXT, PDF.

Kumabjorn 08-28-2010 05:23 PM

Is it possible to check if you have a Chinese font installed on the Kindle?
Squares are usually what appear when there isn't any supported font.

HelenaJole 08-28-2010 05:45 PM

I'm interested in the Korean!

Slava 08-28-2010 05:54 PM

According to this review, traditional and simplified Chinese, Japanese and Korean characters are supported. He has screenshots that seem to confirm that. Perhaps your document should be in Unicode.

ppw 08-28-2010 06:43 PM

Quote:

Originally Posted by Slava (Post 1080923)
Perhaps your document should be in Unicode.

Do you know how to check the encoding of Mobi or epub files? There seems to be no option in Calibre.
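
One rough way to check outside Calibre, at least for TXT and EPUB, is to test whether the bytes decode cleanly as UTF-8. The Python sketch below is an assumption-laden illustration, not a proper encoding detector: the function names are made up, it relies on an EPUB being a ZIP archive of text files, it would also flag a file saved as UTF-16, and it does not handle Mobi at all. A file that fails the check (for example Chinese text saved as GBK) is a likely candidate for the "squares" problem.

```python
import zipfile


def looks_like_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def epub_non_utf8_members(epub_file):
    """List text members of an EPUB that fail strict UTF-8 decoding.

    epub_file may be a path or a binary file-like object. Only members with
    common text extensions are checked; images and fonts are skipped.
    """
    text_suffixes = (".xhtml", ".html", ".htm", ".opf", ".ncx", ".xml", ".css")
    bad = []
    with zipfile.ZipFile(epub_file) as zf:
        for name in zf.namelist():
            if name.lower().endswith(text_suffixes):
                if not looks_like_utf8(zf.read(name)):
                    bad.append(name)
    return bad


if __name__ == "__main__":
    print(looks_like_utf8("中文".encode("utf-8")))  # UTF-8 bytes pass
    print(looks_like_utf8("中文".encode("gbk")))    # GBK bytes fail
```

For a TXT file, reading it with `open(path, "rb").read()` and passing the result to `looks_like_utf8` is enough; a passing result does not prove the file is UTF-8, only that it is consistent with it.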

ppw 08-28-2010 06:44 PM

Quote:

Originally Posted by Kumabjorn (Post 1080888)
Is it possible to check if you have a Chinese font installed on the Kindle?
Squares are usually what appear when there isn't any supported font.

Don't see any option to check what's installed. Though it seems people can hack it.


All times are GMT -4. The time now is 06:54 PM.
