05-26-2020, 09:31 PM | #1 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Should Chinese Fonts be Embedded in Ebooks?
Since I don't read/write Chinese, I was wondering if anyone on MR could help.
I know that many CJK Unicode characters can render differently depending on which language they're in (Chinese/Korean/Japanese). (See "Han unification" on Wikipedia.)
The Fonts/Sentences
The documents I'm converting used these 4 fonts in the original DOCs: SimSun, MS Gothic, PMingLiU, and MS Mincho.
Here's an example sentence of each:
(There are ~80 in total.)
I converted all to use lang="zh" + xml:lang="zh":
Code:
(<i>Shujing</i>, “The Great Declaration I”, <span class="chinese" lang="zh" xml:lang="zh">泰誓上</span>) [...]
1. Is "zh" the proper lang to use in this case? (I used Google Translate and it seems like all the characters are Chinese, but I'm not sure whether they're Simplified or Traditional [zh-Hans or zh-Hant].)
2. When working with these characters, would it be best to embed a Chinese/language-specific font? If so, which one? (Free/open font preferable.)
3. Is there any better way of handling the conversion to ebook? Or should I just trust that the source document had them typed in correctly and that ereaders will render them okay?
I visually inspected some, and they seem to render similarly to the source documents, but I'm not sure how they'll appear on actual ereaders. The examples all look the same except for some small differences in #1 (SimSun + whatever font Sigil is rendering these in):
SimSun
MS Gothic
PMingLiU
MS Mincho
Side Note: For some more CJK Unicode goodness, also see:
https://meta.stackexchange.com/quest...port-han-chara
https://modelviewculture.com/pieces/...-write-my-name
Seems like even many sites don't handle certain cases properly... so I can't imagine the ebook side of things. :P
Last edited by Tex2002ans; 05-27-2020 at 01:46 AM. |
05-26-2020, 10:38 PM | #2 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
Are these books being produced for sale?
Do you have specific ecosystems in mind for these books? I don't read or speak Chinese but I know that Kindles have fonts for Chinese books and have different handling for simplified vs. traditional Chinese. |
05-27-2020, 01:32 AM | #3 | |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Yes.
All the usual major ones. (B&N, Kobo, Amazon, [...].)
Quote:
With Chinese, I previously ran across only ~2-3 characters in an entire book. In that case, I either didn't bother (2 characters likely wouldn't be missed if the reader didn't display them), or I subset a font (like Droid Sans Fallback) just for those.
In this specific case, it's 2 articles (out of ~230) that have dozens of Chinese words inside... and now that I've learned about the language-dependent glyphs, I want this done right.
Side Note: Just now I ran across this:
https://en.wikipedia.org/wiki/List_of_CJK_fonts
which lists:
None are open-source (so definitely not embeddable). And I may be dealing with different languages than I thought...
I also wonder if Droid Sans Fallback is substitutable for all those, and will morph depending on lang... has anyone tested this across different ereaders?
Side Note #2: Here are the 2 actual PDFs if anyone wants to take a closer look:
http://libertarianpapers.org/wp-cont...3/lp-5-1-5.pdf
http://libertarianpapers.org/wp-cont...6/lp-8-1-6.pdf
Everything is CC 3.0.
Last edited by Tex2002ans; 05-27-2020 at 03:23 AM. |
|
05-27-2020, 05:13 AM | #4 |
the rook, bossing Never.
Posts: 11,145
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Do the PDFs embed the required fonts? Otherwise you don't know what it should look like.
|
05-27-2020, 06:01 AM | #5 | |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
But there are two parallel issues here:
1. Fonts: Since I can't use any of those 4 proprietary fonts, I'm going to have to rely on different fonts in the ebook. On the proofing side of things, it's hard to tell whether I'm seeing simple font differences (like the difference between Serif/Sans-Serif fonts)... or whether stripping those fonts can cause the displayed text to now be wrong.
Side Note: It looks like "Source Han Sans" may be another potential font candidate.
2. HTML Language: There are actual language variations (different swashes and swooshes). For example, this single character, 返 (U+8FD4), has at least 5 different representations depending on the language:
https://en.wikipedia.org/wiki/File:S...Difference.svg
In ebooks, this would require proper lang markup:
Code:
<span lang="zh-Hans">返</span> (Simplified Chinese)
<span lang="zh-Hant">返</span> (Traditional Chinese)
<span lang="zh-HK">返</span> (Traditional Chinese - Hong Kong)
<span lang="ja">返</span> (Japanese)
<span lang="ko">返</span> (Korean)
I mean, to me, the few sample images I posted in #1 look similar, but I don't know, because it all looks Chinese to me.
Side Note: My best guess currently is that I can change anything that was in:
PMingLiU -> lang="zh-Hant" (Traditional Chinese)
SimSun -> lang="zh-Hans" (Simplified Chinese)
MS Gothic + MS Mincho -> lang="ja" (Japanese)
then substitute in a thoroughly vetted Asian font (like Source Han Sans).
But then comes actual device support... has anyone meticulously tested this stuff across devices?
Last edited by Tex2002ans; 05-29-2020 at 08:33 PM. |
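The font-to-language guess in the post above can be sketched as a small lookup. This is only a minimal Python sketch: `FONT_TO_LANG` and `span_for` are hypothetical names introduced here, the mapping is the poster's unverified best guess, and the `japanese` class is assumed by analogy with the thread's `class="chinese"`.

```python
from html import escape

# Assumed mapping from the source DOC's font name to a language tag.
# It mirrors the "best guess" in the post; it is not a verified rule.
FONT_TO_LANG = {
    "PMingLiU": "zh-Hant",   # Traditional Chinese
    "SimSun": "zh-Hans",     # Simplified Chinese
    "MS Gothic": "ja",       # Japanese
    "MS Mincho": "ja",       # Japanese
}


def span_for(text: str, source_font: str) -> str:
    """Wrap text in a span carrying both lang and xml:lang (hypothetical helper)."""
    lang = FONT_TO_LANG[source_font]  # raises KeyError for an unmapped font
    css_class = "japanese" if lang == "ja" else "chinese"
    return (
        f'<span class="{css_class}" lang="{lang}" xml:lang="{lang}">'
        f"{escape(text)}</span>"
    )


print(span_for("泰誓上", "PMingLiU"))
```

A font name is only a hint about what the original author intended, so each tagged run would still want a check by someone who reads the language.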
05-27-2020, 08:43 AM | #6 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
I get it now. The book is primarily in English with Chinese characters here and there.
As this relates to Kindle, there are language-specific fonts for Simplified and Traditional Chinese, but those won't come into play, since they are enabled based on the primary language of the book. The regular fonts probably won't have the characters you want, and I believe that the fallback is the Code2000 font. I doubt that it has any handling of language-specific character variants. So it does appear that you would need to embed a font with the correct language variant. Using images instead would be more foolproof. |
05-27-2020, 10:51 AM | #7 |
the rook, bossing Never.
Posts: 11,145
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
I gave up and used an image (screen-captured and reduced from the source!) at the first occurrence, along with a transliteration, and then just the transliteration after that. Which may or may not have been correct. It was a few years ago and I tended to get [][][][][] on the actual ebook, but I didn't know much about Calibre or font embedding or CSS for language support then.
Also, if you had someone Chinese to check it, would they be the "right" Chinese person? Though the various written scripts are simple compared with the bewildering variety of spoken "Chinese" languages. |
05-27-2020, 03:38 PM | #8 | |||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Quote:
This is an English book with the occasional Chinese/Japanese character (~80 foreign words).
Side Note: Do you know which fonts Kindles have for Simplified/Traditional Chinese?
Quote:
Symbola is also a "fallback font" I embed whenever I'm dealing with very obscure Unicode characters (like Wingdings/Webdings, which I wrote about in 2016).
Quote:
Quote:
Side Note: On the many Asian font bugs and poor support across all types of programs... I recommend checking out some of these talks:
That's where I first learned about many of these Asian-specific issues.
Last edited by Tex2002ans; 05-27-2020 at 08:02 PM. |
05-27-2020, 06:07 PM | #9 |
Grand Sorcerer
Posts: 6,496
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
|
As far as I know there are eight fonts. They are named Heiti, Kaiti, Song, and Yuan with separate ones for Traditional and Simplified Chinese. I don't know any details about these.
|
05-28-2020, 07:47 AM | #10 | |
the rook, bossing Never.
Posts: 11,145
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Quote:
It's a shame that, even though these issues were largely solved at the OS level before anyone made any eink reader, the early Kindles are so poor. What I do now isn't the same as even four years ago. |
|
05-29-2020, 10:45 PM | #11 |
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Doing a bit more research into "Source Han Sans":
https://github.com/adobe-fonts/source-han-sans
They offer it as:
You can read more about why in the readme, or in this helpful explanation post: Adobe's CJK Type Blog: "Source Han Sans: OTF, OTC, Super OTC, or Subset OTF?"
Turns out, OTC is an "OpenType/CFF Collection", the CFF-based counterpart of a TTC (TrueType Collection). (All technical details can be read in Microsoft: "The OpenType Font File".) I doubt this works in ebooks.
So, the best bet would probably be to download the individual OTFs as needed, then embed them. That would:
See Microsoft's The Old New Thing: "What happened to the Arial Unicode MS font?" and Wikipedia: "Arial Unicode MS".
Last edited by Tex2002ans; 05-29-2020 at 11:00 PM. |
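Since a collection file (OTC/TTC) and a single-font OTF are different containers, one quick sanity check before embedding is to look at the first four bytes of the file. A minimal Python sketch, assuming you just want to tell the containers apart: `font_container` and `sniff_file` are hypothetical helper names, and the check only reads the signature. It says nothing about whether a given ereader will accept the font.

```python
def font_container(header: bytes) -> str:
    """Classify a font file by its 4-byte signature (hypothetical helper)."""
    tag = header[:4]
    if tag == b"ttcf":
        return "collection"      # OTC/TTC: several fonts in one file
    if tag == b"OTTO":
        return "opentype-cff"    # single OpenType font with CFF outlines
    if tag in (b"\x00\x01\x00\x00", b"true"):
        return "truetype"        # single font with TrueType outlines
    if tag in (b"wOFF", b"wOF2"):
        return "woff"            # web font wrapper
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first four bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return font_container(handle.read(4))
```

Usage would be something like `sniff_file("SomeFont.otf")` (a placeholder file name); anything that comes back as "collection" is the kind of file the post doubts will work in ebooks.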
05-30-2020, 01:50 PM | #12 | ||
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Quote:
@Tex2002ans You also might want to check out Noto CJK. Last edited by Doitsu; 05-30-2020 at 02:00 PM. |
05-30-2020, 01:56 PM | #13 |
Resident Curmudgeon
Posts: 73,931
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
How are you going to handle the Chinese characters in Mobi eBooks?
|
05-30-2020, 03:31 PM | #14 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Yeah, a lot of the Android fonts are also good, since they're (usually) open source + have to work across the entire world for billions of users at all different DPIs.
Here are all of the Asian characters being used (Sigil's "Characters in HTML" Report):
Code:
「」えとるアジ丈三上世之京仁佐保倉儒公六凱利到剛劉勢化南口古史司合君周命和商啟嘲四報墨夢大天太好子存学學專小岡崖州帝平年弼從德惠戰揚教文料斯景書末朱李束東林格業樹殘毅民氣江法泰津派浦淮清湖為無熹營爭片物狐獨玉王理瑞產用申發盜目研祖禮秀私程究紂紓經編老臣自蒙虎術袁覚言記詩誓說譜谷資造連遊道遠遺鉄録鏢鐵長開陰陳雲青非革韓頤魯鴉鶚黃黄齊
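A character report like the one above mixes kana with Han characters, and kana is a strong hint that a run is Japanese. A rough Python sketch for a first-pass triage, using only the standard library: `script_of` and `triage` are hypothetical names, and the classification leans on Unicode character-name prefixes, which is a convenience assumption and not a full script lookup.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Classify one character by the prefix of its Unicode character name."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith(("CJK UNIFIED IDEOGRAPH", "CJK COMPATIBILITY IDEOGRAPH")):
        return "han"
    return "other"


def triage(text: str) -> str:
    """Return 'ja' if any kana is present, 'han-only' if only Han, else 'none'."""
    scripts = {script_of(ch) for ch in text}
    if scripts & {"hiragana", "katakana"}:
        return "ja"
    if "han" in scripts:
        return "han-only"   # could be zh-Hans, zh-Hant, ja, or ko: needs a human
    return "none"


print(triage("えとる"), triage("泰誓上"))
```

Han-only runs cannot be told apart this way, since the same code points are shared across Chinese, Japanese, and Korean. Those still need the source font or a reader of the language to decide.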
Note: I attached the 2 articles in EPUB if anyone wants to do testing. They're WIP files as of today, and I currently have no idea if I marked the languages up properly, but you can search for:
Code:
<span class="chinese"
<span class="japanese"
And in the CSS file:
Code:
span.chinese
span.japanese
Original PDFs are in Post #3. If anyone wants the HTML straight from Word, let me know and I can attach that too (since it has the original font markup). But let me warn you, it's disgusting, and the characters are wrongly marked as... "French".
Quote:
Side Note: Also, Chapter 18 "East Asia" of the Unicode Standard:
http://www.unicode.org/versions/Unicode13.0.0/
covers a ton of stuff (like half-width/full-width characters). I guess I have some more reading to do.
Wouldn't old MOBI (KF7) display Code2000?
Last edited by Tex2002ans; 05-30-2020 at 03:39 PM. |
05-30-2020, 04:26 PM | #15 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Yes, but only Kindle 3 (AKA Kindle Keyboard) and higher. |