05-26-2020, 09:31 PM   #1
Tex2002ans
Should Chinese Fonts be Embedded in Ebooks?

Since I don't read/write Chinese, I was wondering if anyone on MR could help.

I know that many CJK Unicode characters can render differently depending on which language the text is in (Chinese/Korean/Japanese). (See "Han unification" on Wikipedia.)
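
To make the Han unification point concrete, here is a minimal Python sketch that lists the code point and Unicode name of each character in one of the sample strings. The helper name `describe` is made up for illustration. The names it prints are "CJK UNIFIED IDEOGRAPH-…", with no language attached to the code point itself, which is why the language tag and the font end up deciding which glyph variant is drawn.

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# Sample taken from the MS Mincho example in this post.
for ch, code_point, name in describe("史記"):
    print(ch, code_point, name)
```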

The Fonts/Sentences

The documents I'm converting used these 4 fonts in the original DOCs:
  • SimSun
  • MS Gothic
  • PMingLiU
  • MS Mincho

Here's an example sentence of each:

Code:
(<i>Shujing</i>, “The Great Declaration I”, <span style='font-family:SimSun'>泰誓上</span>)

[...]

Liu E, also known as Liu Tieyun <span style='font-family:"MS Gothic"'>劉鐵雲</span>, was born in 1857 at Liuhe <span style='font-family:"MS Gothic"'>六合</span> county in what is today Nanjing <span style='font-family:"MS Gothic"'>南京</span>.

[...]

From Liu E’s<span style='font-family:"PMingLiU"'>劉鶚</span> preface to <i>The Travels of Laocan</i> (<i>Laocan youji</i> <span style='font-family:"PMingLiU"'>老殘遊記</span>).

[...]

In his <i>Historical Records</i> (<i>Shiji</i> <span style='font-family:"MS Mincho"'>史記</span>), Sima Qian quotes the philosopher Jia Yi,


(There are ~80 in total.)

I converted them all to use lang="zh" + xml:lang="zh":

Code:
(<i>Shujing</i>, “The Great Declaration I”, <span class="chinese" lang="zh" xml:lang="zh">泰誓上</span>)

[...]
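
For anyone wanting to repeat this conversion on the ~80 spans, a rough Python sketch follows. It is not necessarily how the conversion above was done. It assumes the spans look exactly like the samples in this post (a style attribute containing only font-family, one of the four listed fonts, no nested spans), and the names `FONT_SPAN` and `to_lang_spans` are made up for illustration.

```python
import re

# Matches <span style='font-family:...'>text</span> for the four fonts listed
# in the post, with or without inner quotes around the font name.
FONT_SPAN = re.compile(
    r"""<span\s+style=(['"])font-family:\s*["']?(?:SimSun|MS Gothic|PMingLiU|MS Mincho)["']?;?\1>(.*?)</span>""",
    re.DOTALL,
)

def to_lang_spans(html, lang="zh"):
    """Replace font-family spans with language-tagged spans."""
    replacement = rf'<span class="chinese" lang="{lang}" xml:lang="{lang}">\2</span>'
    return FONT_SPAN.sub(replacement, html)

sample = "Nanjing <span style='font-family:\"MS Gothic\"'>南京</span>."
print(to_lang_spans(sample))
```

A regex is only workable here because the markup is so uniform. For messier HTML, a real parser would be the safer choice.
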
The Questions

1. Is "zh" the proper lang to use in this case?

(I ran them through Google Translate and all the characters seem to be Chinese, but I'm not sure whether they're Simplified or Traditional [zh-Hans or zh-Hant].)
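
One rough way to get a hint on the Simplified/Traditional question is to check which legacy encoding can represent a string: GB2312 covers mostly Simplified forms, Big5 covers Traditional ones. The Python sketch below is only a heuristic under that assumption, not a reliable detector. Many characters are shared by both sets and come back undetermined, and the function names are made up for illustration.

```python
def encodable(text, codec):
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

def guess_script(text):
    """Guess zh-Hant / zh-Hans from Big5 vs. GB2312 coverage (heuristic only)."""
    in_big5 = encodable(text, "big5")
    in_gb2312 = encodable(text, "gb2312")
    if in_big5 and not in_gb2312:
        return "zh-Hant"
    if in_gb2312 and not in_big5:
        return "zh-Hans"
    return "undetermined"

# Samples taken from the spans quoted in this post.
for sample in ["劉鐵雲", "南京", "史記"]:
    print(sample, guess_script(sample))
```

Running this per span rather than on the whole book gives a more useful result, since short strings such as place names often contain only shared characters.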

2. When working with these characters, would it be best to embed a Chinese/language-specific font? If so, which one?

(Free/Open font preferable.)

3. Is there a better way of handling these when converting to ebook? Or should I just trust that the characters were typed correctly in the source document and that ereaders will render them okay?

I visually inspected some, and they seem to render similarly to the source documents, but I'm not sure how they'll appear on actual ereaders.

The examples all look the same except for some small differences in #1 (SimSun vs. whatever font Sigil is rendering these in):

SimSun

[Attached images: Chinese.Example.1.PDF.png, Chinese.Example.1.EPUB.png]

MS Gothic

[Attached images: Chinese.Example.2.PDF.png, Chinese.Example.2.EPUB.png]

PMingLiU

[Attached images: Chinese.Example.3.PDF.png, Chinese.Example.3.EPUB.png]

MS Mincho

[Attached images: Chinese.Example.4.PDF.png, Chinese.Example.4.EPUB.png]

Side Note: For some more CJK Unicode goodness, also see:

https://meta.stackexchange.com/quest...port-han-chara
https://modelviewculture.com/pieces/...-write-my-name

It seems like even many websites don't handle certain cases properly... so I can't imagine how the ebook side of things fares. :P

Last edited by Tex2002ans; 05-27-2020 at 01:46 AM.