Since I don't read/write Chinese, I was wondering if anyone on MR could help.
I know that many CJK Unicode characters can render differently depending on which language they're tagged as (Chinese/Korean/Japanese). (See "Han unification" on Wikipedia.)
The Fonts/Sentences
The documents I'm converting used these 4 fonts in the original DOCs:
- SimSun
- MS Gothic
- PMingLiU
- MS Mincho
Here's an example sentence of each:
(There are ~80 in total.)
I converted them all to use lang="zh" + xml:lang="zh":
Code:
(<i>Shujing</i>, “The Great Declaration I”, <span class="chinese" lang="zh" xml:lang="zh">泰誓上</span>)
[...]
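For anyone wanting to repeat the tagging step above across ~80 passages, here is a minimal sketch of how it could be scripted. It is not how the conversion was actually done; the function name `wrap_cjk` and the regex are made up for illustration. It assumes the input is text that has not been tagged yet (running it twice would nest the spans), and it only covers the basic CJK Unified Ideographs block plus Extension A.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block and Extension A
# are matched; rarer extension blocks and CJK punctuation are not covered.
CJK_RUN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]+")


def wrap_cjk(text, lang="zh", css_class="chinese"):
    """Wrap each run of Han characters in a language-tagged span.

    Hypothetical helper: assumes the text has not been wrapped already.
    """
    def tag(match):
        return (f'<span class="{css_class}" lang="{lang}" '
                f'xml:lang="{lang}">{match.group(0)}</span>')

    return CJK_RUN.sub(tag, text)


print(wrap_cjk("(<i>Shujing</i>, “The Great Declaration I”, 泰誓上)"))
```

Passing a different `lang` (for example "zh-Hant") changes both attributes together, so the two stay in sync if the tag turns out to need refining.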
The Questions
1. Is "zh" the proper lang to use in this case?
(I used Google Translate and it seems like all the characters are Chinese, but I'm not sure whether they're Simplified or Traditional [zh-Hans or zh-Hant].)
2. When working with these characters, would it be best to embed a Chinese/language-specific font? If so, which one?
(Free/Open font preferable.)
3. Is there any better way of handling conversion to ebook? Or should I just trust the source document had them typed in correctly and that ereaders will render okay?
I visually inspected some, and they seem to render similarly to the source documents, but I'm not sure how they'll appear on actual ereaders.
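On question 1, one rough way to sanity-check Simplified vs. Traditional without reading Chinese is to test whether a passage fits in the legacy GB2312 (Simplified) or Big5 (Traditional) character sets. This is only a heuristic sketch: `guess_script` is a made-up helper, many characters are shared by both sets (so "either" is a common answer), and it says nothing about whether a passage is actually Japanese.

```python
def encodable(text, codec):
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def guess_script(text):
    """Heuristic guess at a BCP 47 tag; hypothetical helper, not authoritative."""
    in_gb2312 = encodable(text, "gb2312")  # Simplified Chinese character set
    in_big5 = encodable(text, "big5")      # Traditional Chinese character set
    if in_gb2312 and not in_big5:
        return "zh-Hans"
    if in_big5 and not in_gb2312:
        return "zh-Hant"
    if in_gb2312 and in_big5:
        return "either"   # only characters common to both sets
    return "unknown"      # mixed text or characters outside both sets


for sample in ["汉字", "漢字", "泰誓上"]:
    print(sample, guess_script(sample))
```

If most of the passages come back as "either", plain "zh" is probably a reasonable tag; a run of "zh-Hant" results would suggest the more specific tag is worth using.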
The examples all look the same except for some small differences in #1 (SimSun + whatever font Sigil is rendering these in):
SimSun
MS Gothic
PMingLiU
MS Mincho
Side Note: For some more CJK Unicode goodness, also see:
https://meta.stackexchange.com/quest...port-han-chara
https://modelviewculture.com/pieces/...-write-my-name
Seems like even many sites don't handle certain cases properly... so I can't imagine the ebook side of things. :P