Old 05-26-2020, 09:31 PM   #1
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Should Chinese Fonts be Embedded in Ebooks?

Since I don't read/write Chinese, I was wondering if anyone on MR could help.

I know that many CJK Unicode characters can render differently depending on which language they're in (Chinese/Japanese/Korean). (See "Han unification" on Wikipedia.)

The Fonts/Sentences

The documents I'm converting used these 4 fonts in the original DOCs:
  • SimSun
  • MS Gothic
  • PMingLiU
  • MS Mincho

Here's an example sentence of each:

Code:
(<i>Shujing</i>, “The Great Declaration I”, <span style='font-family:SimSun'>泰誓上</span>)

[...]

Liu E, also known as Liu Tieyun <span style='font-family:"MS Gothic"'>劉鐵雲</span>, was born in 1857 at Liuhe <span style='font-family:"MS Gothic"'>六合</span> county in what is today Nanjing <span style='font-family:"MS Gothic"'>南京</span>.

[...]

From Liu E’s<span style='font-family:"PMingLiU"'>劉鶚</span> preface to <i>The Travels of Laocan</i> (<i>Laocan youji</i> <span style='font-family:"PMingLiU"'>老殘遊記</span>).

[...]

In his <i>Historical Records</i> (<i>Shiji</i> <span style='font-family:"MS Mincho"'>史記</span>), Sima Qian quotes the philosopher Jia Yi,


(There are ~80 in total.)

I converted all to use lang="zh" + xml:lang="zh":

Code:
(<i>Shujing</i>, “The Great Declaration I”, <span class="chinese" lang="zh" xml:lang="zh">泰誓上</span>)

[...]
The Questions

1. Is "zh" the proper lang to use in this case?

(I used Google Translate and it seems like all the characters are in Chinese, but I'm not sure if it's Simplified/Traditional [zh-Hans or zh-Hant].)

2. When working with these characters, would it be best to embed a Chinese/language-specific font? If so, which one?

(Free/Open font preferable.)

3. Is there any better way of handling conversion to ebook? Or should I just trust the source document had them typed in correctly and that ereaders will render okay?

I visually inspected some, and they seem to render similarly to the source documents, but I'm not sure how they'll appear on actual ereaders.

The examples all look the same except for some small differences in #1 (SimSun + whatever font Sigil is rendering these in):

SimSun

[Images: Chinese.Example.1.PDF.png, Chinese.Example.1.EPUB.png]

MS Gothic

[Images: Chinese.Example.2.PDF.png, Chinese.Example.2.EPUB.png]

PMingLiU

[Images: Chinese.Example.3.PDF.png, Chinese.Example.3.EPUB.png]

MS Mincho

[Images: Chinese.Example.4.PDF.png, Chinese.Example.4.EPUB.png]

Side Note: For some more CJK unicode goodness, also see:

https://meta.stackexchange.com/quest...port-han-chara
https://modelviewculture.com/pieces/...-write-my-name

Seems like even many sites don't handle certain cases properly... so I can't imagine the ebook side of things. :P

Last edited by Tex2002ans; 05-27-2020 at 01:46 AM.
Old 05-26-2020, 10:38 PM   #2
jhowell
Grand Sorcerer
Posts: 7,155
Karma: 92500001
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
Are these books being produced for sale?

Do you have specific ecosystems in mind for these books?

I don't read or speak Chinese but I know that Kindles have fonts for Chinese books and have different handling for simplified vs. traditional Chinese.
Old 05-27-2020, 01:32 AM   #3
Tex2002ans
Quote:
Originally Posted by jhowell View Post
Are these books being produced for sale?
Yes.

Quote:
Originally Posted by jhowell View Post
Do you have specific ecosystems in mind for these books?
All the usual major ones. (B&N, Kobo, Amazon, [...].)

Quote:
Originally Posted by jhowell View Post
I don't read or speak Chinese but I know that Kindles have fonts for Chinese books and have different handling for simplified vs. traditional Chinese.
I was treating it similarly to how I handle Polytonic Greek. Since many of those obscure Greek characters don't show up on old devices, I embed a font (like Galatia SIL) just for that "greek" class, then subset it.

With Chinese, I previously ran across only ~2-3 characters in an entire book. In that case, I either didn't bother (2 characters likely wouldn't be missed if the reader didn't display), or I subset a font (like Droid Sans Fallback) just for those.

In this specific case, it's 2 articles (out of ~230) that have dozens of Chinese words inside... and now that I've learned about the language-dependent glyphs, I want this done right.

Side Note: Just now I ran across this:

https://en.wikipedia.org/wiki/List_of_CJK_fonts

which lists:
  • Traditional Chinese
    • PMingLiU
  • Simplified Chinese
    • SimSun
  • Japanese
    • MS Gothic
    • MS Mincho

None are open-source (so definitely not embeddable).

And I may be dealing with different languages than I thought... I also wonder if Droid Sans Fallback is substitutable for all those, and will morph depending on lang... has anyone tested this across different ereaders?

Side Note #2: Here are the 2 actual PDFs if anyone wants to take a closer look:

http://libertarianpapers.org/wp-cont...3/lp-5-1-5.pdf
http://libertarianpapers.org/wp-cont...6/lp-8-1-6.pdf

Everything is CC 3.0.

Last edited by Tex2002ans; 05-27-2020 at 03:23 AM.
Old 05-27-2020, 05:13 AM   #4
Quoth
Still reading
Posts: 14,901
Karma: 110507267
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Do the PDFs embed the required fonts? Otherwise you don't know what it should look like
Old 05-27-2020, 06:01 AM   #5
Tex2002ans
Quote:
Originally Posted by Quoth View Post
Do the PDFs embed the required fonts? Otherwise you don't know what it should look like
Those are the PDFs generated years ago, and then I have the actual DOC(X)s (this is how I know all the font information + have all the correct underlying unicode characters).

But there are two parallel issues here:

1. Fonts: Since I can't use any of those 4 proprietary fonts, I'm going to have to rely on different fonts in the ebook.

On the proofing side of things, it's hard to tell whether these are simple font differences (like the difference between Serif/Sans-Serif fonts)... or whether stripping those fonts makes the displayed text wrong.

Side Note: It looks like "Source Han Sans" may be another potential font candidate.

2. HTML Language: There are actual language variations (different swashes and swooshes).

For example, this single character:

返 (U+8FD4)

in different languages, has at least 5 different representations:

https://en.wikipedia.org/wiki/File:S...Difference.svg

In ebooks, this would require proper lang markup:

Code:
<span lang="zh-Hans">返</span> (Simplified Chinese)
<span lang="zh-Hant">返</span> (Traditional Chinese)
<span lang="zh-HK">返</span> (Traditional Chinese - Hong Kong)
<span lang="ja">返</span> (Japanese)
<span lang="ko">返</span> (Korean)
All are the same Unicode character, but should display differently (like the above SVG).
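
A quick way to convince yourself of the "same Unicode character" part is to look at the code point directly. The Python sketch below is only an illustration (the `langs` list and the helper name `span_info` are made up for this example): it rebuilds the five spans, strips the markup again, and shows that each one carries the identical character. Any difference on screen therefore has to come from the font and the lang attribute, not from the text itself.

```python
import re
import unicodedata

CHAR = "\u8fd4"  # 返, the example character from the post

# The five language tags used in the markup above.
langs = ["zh-Hans", "zh-Hant", "zh-HK", "ja", "ko"]
spans = [f'<span lang="{lang}">{CHAR}</span>' for lang in langs]

def span_info(markup):
    """Return (lang, code point as U+XXXX, Unicode name) for a one-character span."""
    match = re.fullmatch(r'<span lang="([^"]+)">(.)</span>', markup)
    lang, char = match.group(1), match.group(2)
    return lang, f"U+{ord(char):04X}", unicodedata.name(char)

results = [span_info(s) for s in spans]
for lang, code_point, name in results:
    print(f"{lang:8} {code_point} {name}")

# Collect the distinct code points seen across all five languages.
distinct_code_points = {code_point for _, code_point, _ in results}
```

So the lang attribute is the only thing a renderer has to go on when it picks a regional glyph, which is why the markup matters.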

I mean, to me, the few sample images I posted in #1 look similar, but I don't know, because it all looks Chinese to me.

Side Note: My best guess currently, is that I can change anything that was in:

PMingLiU -> lang="zh-Hant" (Traditional Chinese)
SimSun -> lang="zh-Hans" (Simplified Chinese)
MS Gothic + MS Mincho -> lang="ja" (Japanese)

then substitute in a thoroughly vetted Asian font (like Source Han Sans). But then comes actual device support... has anyone meticulously tested this stuff across devices?
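
If that guess holds, the retagging itself is mechanical. Here is a rough Python sketch of it, with loud caveats: the font-to-language table is just the guess above (not verified against the actual text), the class names are the "chinese"/"japanese" ones used in the EPUB, the function name `retag` is made up, and the regex only handles flat spans like the samples in post #1. Real Word HTML is messier and would probably need a proper parser.

```python
import re

# Assumed mapping, copied from the guess above; not verified against the articles.
FONT_TO_LANG = {
    "PMingLiU": ("chinese", "zh-Hant"),
    "SimSun": ("chinese", "zh-Hans"),
    "MS Gothic": ("japanese", "ja"),
    "MS Mincho": ("japanese", "ja"),
}

# Matches flat Word-style spans, with or without quotes around the font name:
#   <span style='font-family:SimSun'>...</span>
#   <span style='font-family:"MS Gothic"'>...</span>
SPAN_RE = re.compile(
    r"""<span\s+style='font-family:\s*"?([^"']+)"?'>(.*?)</span>""",
    re.DOTALL,
)

def retag(html):
    """Swap font-family spans for class + lang + xml:lang spans."""
    def replace(match):
        font, text = match.group(1).strip(), match.group(2)
        if font not in FONT_TO_LANG:
            return match.group(0)  # leave spans with unknown fonts untouched
        css_class, lang = FONT_TO_LANG[font]
        return f'<span class="{css_class}" lang="{lang}" xml:lang="{lang}">{text}</span>'
    return SPAN_RE.sub(replace, html)

print(retag("(<i>Shujing</i>, <span style='font-family:SimSun'>泰誓上</span>)"))
```

Keeping the table separate from the regex means a wrong guess (say, MS Gothic text that turns out to be Chinese) only needs a one-line change plus a re-run.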

Last edited by Tex2002ans; 05-29-2020 at 08:33 PM.
Old 05-27-2020, 08:43 AM   #6
jhowell
I get it now. The book is primarily in English with Chinese characters here and there.

As this relates to Kindle there are language specific fonts for Simplified and Traditional Chinese, but those won't come into play since they are enabled based on the primary language of the book. The regular fonts probably won't have the characters you want and I believe that the fallback is the Code2000 font. I doubt that has any handling of language-specific character variants.

So it does appear that embedding a font with the correct language variant would need to be done. Using images instead would be more foolproof.
Old 05-27-2020, 10:51 AM   #7
Quoth
I gave up and used an image (screen captured and reduced from source!) at first occurrence with transliteration and then just transliteration. Which may or may not have been correct. It was a few years ago and I tended to get [][][][][] on the actual ebook, but I didn't know much about Calibre or Font Embedding or CSS for language support then.

Also, even if you found someone Chinese to check it, would they be the "right" Chinese person? That said, the various written scripts are simple compared with the bewildering variety of spoken "Chinese" languages.
Old 05-27-2020, 03:38 PM   #8
Tex2002ans
Quote:
Originally Posted by jhowell View Post
I get it now. The book is primarily in English with Chinese characters here and there.


Quote:
Originally Posted by jhowell View Post
As this relates to Kindle there are language specific fonts for Simplified and Traditional Chinese, but those won't come into play since they are enabled based on the primary language of the book.
Agreed.

This is an English book with the occasional Chinese/Japanese character (~80 foreign words).

Side Note: Do you know which fonts Kindles have for Simplified/Traditional Chinese?

Quote:
Originally Posted by jhowell View Post
The regular fonts probably won't have the characters you want and I believe that the fallback is the Code2000 font.
I believe so too.

Symbola is also a "fallback font" I embed whenever I'm dealing with very obscure Unicode characters (like Wingdings/Webdings, which I wrote about in 2016).

Quote:
Originally Posted by jhowell View Post
I doubt that has any handling of language-specific character variants. So it does appear that embedding a font with the correct language variant would need to be done.
Agreed. Doubt Symbola handles that either. Probably need a font specifically designed for Asian languages.

Quote:
Originally Posted by Quoth View Post
I gave up and used an image (screen captured and reduced from source!) at first occurrence with transliteration and then just transliteration. Which may or may not have been correct. It was a few years ago and I tended to get [][][][][] on the actual ebook, but I didn't know much about Calibre or Font Embedding or CSS for language support then.
I strongly recommend against inserting text as images. I wrote about some reasons why in the 2018 Greek thread.

Side Note: On many Asian font bugs and poor support across all types of programs... I recommend checking out some of these talks:

That's where I first learned about many of these Asian-specific issues.

Last edited by Tex2002ans; 05-27-2020 at 08:02 PM.
Old 05-27-2020, 06:07 PM   #9
jhowell
Quote:
Originally Posted by Tex2002ans View Post
Side Note: Do you know which fonts Kindles have for Simplified/Traditional Chinese?
As far as I know there are eight fonts. They are named Heiti, Kaiti, Song, and Yuan with separate ones for Traditional and Simplified Chinese. I don't know any details about these.
Old 05-28-2020, 07:47 AM   #10
Quoth
Quote:
Originally Posted by Tex2002ans View Post
I strongly recommend against inserting text as images. I wrote about some reasons why in the 2018 Greek thread.
I'd agree.
It's a shame that, although these issues were largely solved at the OS level before anyone made an eink reader, the early Kindles handle them so poorly.

What I do now isn't the same as even four years ago.
Old 05-29-2020, 10:45 PM   #11
Tex2002ans
Doing a bit more research into "Source Han Sans":

https://github.com/adobe-fonts/source-han-sans

They offer it as:
  • 1 single OTC font file
    • Includes all languages and all weights.
    • Note: Works in Mac OS X 10.8 + Windows 10 (1703 or above).
  • 7 OTCs
    • Split per weight.
    • Note: Works in Mac OS X 10.8 + Windows 10 (1607 or above).
  • 28 OTFs
    • Split per language (Japanese + Korean + Simplified/Traditional Chinese) + per weight.
    • Includes all the characters, just displays that language's variants where applicable.
  • 28 Subset OTFs
    • Split per language, then all the characters not in that language are removed.

You can read more about why in the readme, or this helpful explanation post:

Adobe's CJK Type Blog: "Source Han Sans: OTF, OTC, Super OTC, or Subset OTF?"

Turns out, OTC is an "OpenType/CFF Collection": several fonts packed into one file, the CFF-based counterpart of the older TrueType Collection (TTC). (All technical details can be read in Microsoft: "The OpenType Font File".)

I doubt this works in ebooks.

So, best bet would probably be to download the OTFs as needed, then embed. That would:
  • Minimize ebook filesize
  • Make sure the correct variant is drawn on the device.
    • Not needing to rely on the device to support/understand language-switching fonts.
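
To keep the embedded file small, the per-language OTF can then be subset to just the characters that language actually uses in the book. A minimal Python sketch of the bookkeeping, under stated assumptions: it assumes fontTools' pyftsubset as the subsetting tool (nothing above says which tool is used), `font-for-zh-Hant.otf` and `chinese-subset.otf` are placeholder file names, and `unicodes_argument` is a made-up helper. It only prints the command; it doesn't run it.

```python
def unicodes_argument(text):
    """Return the unique non-ASCII code points in text as 'U+XXXX,U+YYYY,...'."""
    code_points = sorted({ord(ch) for ch in text if ord(ch) > 0x7F})
    return ",".join(f"U+{cp:04X}" for cp in code_points)

# Sample input: the PMingLiU words from the example sentences in post #1.
traditional_text = "劉鶚 老殘遊記"
arg = unicodes_argument(traditional_text)

# Placeholder file names; swap in the real per-language OTF.
print(
    "pyftsubset font-for-zh-Hant.otf "
    f"--unicodes={arg} --output-file=chinese-subset.otf"
)
```

Running it once per lang class (with that class's text) gives one small subset per language, which matches the "one font per variant" approach above.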
Complete Side Note: Over the years, "Arial Unicode MS" was another fallback font I sometimes used. Turns out, it's been deprecated.

See Microsoft's The Old New Thing: "What happened to the Arial Unicode MS font?" and Wikipedia: "Arial Unicode MS".

Last edited by Tex2002ans; 05-29-2020 at 11:00 PM.
Old 05-30-2020, 01:50 PM   #12
Doitsu
Grand Sorcerer
Posts: 5,762
Karma: 24088559
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by jhowell View Post
The regular fonts probably won't have the characters you want and I believe that the fallback is the Code2000.
Code2000 covers most of Unicode 5.2; however, many glyphs don't render well. (For an example, see the attached image.)
Quote:
Originally Posted by jhowell View Post
[...] I doubt that has any handling of language-specific character variants.
Since Traditional and Simplified Chinese characters are encoded using different codepoints, no character variant handling is required.

@Tex2002ans You also might want to check out Noto CJK.
Attached image: ZH_Fonts.png

Last edited by Doitsu; 05-30-2020 at 02:00 PM.
Old 05-30-2020, 01:56 PM   #13
JSWolf
Resident Curmudgeon
Posts: 80,665
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
How are you going to handle the Chinese characters in Mobi eBooks?
Old 05-30-2020, 03:31 PM   #14
Tex2002ans
Quote:
Originally Posted by Doitsu View Post
@Tex2002ans You also might want to check out Noto CJK.


Yeah, a lot of the Android fonts are also good, since they're (usually) open source + have to work across the entire world for billions of users at all different DPIs.

Here's all of the Asian characters being used (Sigil's "Characters in HTML" Report):

Code:
「」えとるアジ丈三上世之京仁佐保倉儒公六凱利到剛劉勢化南口古史司合君周命和商啟嘲四報墨夢大天太好子存学學專小岡崖州帝平年弼從德惠戰揚教文料斯景書末朱李束東林格業樹殘毅民氣江法泰津派浦淮清湖為無熹營爭片物狐獨玉王理瑞產用申發盜目研祖禮秀私程究紂紓經編老臣自蒙虎術袁覚言記詩誓說譜谷資造連遊道遠遺鉄録鏢鐵長開陰陳雲青非革韓頤魯鴉鶚黃黄齊
And I think I narrowed it down: there's a handful of Japanese words in there too. So three languages:
  • Simplified Chinese
  • Traditional Chinese
  • Japanese
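
The Japanese words are the easy part to spot mechanically, because kana (hiragana/katakana) only occur in Japanese text. A small Python sketch of that check, using the Unicode character names (the labels and the function names `script_of` and `group_by_script` are made up for this example). Note the limit: Han characters on their own can't be sorted into Chinese vs. Japanese this way, so this only flags runs that contain kana.

```python
import unicodedata

def script_of(ch):
    """Rough script label for one character, based on its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("HIRAGANA"):
        return "hiragana"
    if name.startswith("KATAKANA"):
        return "katakana"
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "han"
    return "other"

def group_by_script(text):
    """Map each script label to the characters of text that fall under it."""
    groups = {}
    for ch in text:
        groups.setdefault(script_of(ch), []).append(ch)
    return {label: "".join(chars) for label, chars in groups.items()}

# First few characters of the report above, as a sample.
sample = "「」えとるアジ丈三上世之京"
for label, chars in group_by_script(sample).items():
    print(label, chars)
```

Any span whose text lands in the hiragana or katakana group is a safe candidate for lang="ja"; the pure-Han spans still need a human (or the original font information) to decide.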

Note: I attached the 2 articles in EPUB if anyone wants to do testing.

These are WIP files as of today, and I currently have no idea if I marked the languages up properly, but you can search for:

Code:
<span class="chinese"

<span class="japanese"
to find every Asian word.

And in the CSS file:

Code:
span.chinese

span.japanese
if you wanted to test fonts.

Original PDFs are in Post #3.

If anyone wants the HTML straight from Word, let me know and I can attach that too (since it has the original font markup too). But let me warn you, it's disgusting, and the characters are wrongly marked as... "French".

Quote:
Originally Posted by Doitsu View Post
Since Traditional and Simplified Chinese characters are encoded using different codepoints, no character variant handling is required.
Hmmmmm...
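
For what it's worth, that statement does check out for the variant pairs in the character list above. A minimal Python sketch (the three pairs are just illustrative picks from that list, and `describe` is a made-up helper): each form has its own code point, so for these no lang-based switching is involved. It says nothing about unified characters like 返 from post #5, where one code point is shared and the glyph depends on the font and lang.

```python
# Variant pairs that both appear in the character list above (illustrative picks).
pairs = [
    ("\u5b66", "\u5b78"),  # 学 / 學
    ("\u9ec4", "\u9ec3"),  # 黄 / 黃
    ("\u9244", "\u9435"),  # 鉄 / 鐵
]

def describe(pair):
    """Format each character of a pair as '<char> U+XXXX'."""
    return [f"{ch} U+{ord(ch):04X}" for ch in pair]

for pair in pairs:
    print(" vs ".join(describe(pair)))

# True when every pair consists of two different code points.
all_distinct = all(a != b for a, b in pairs)
```

So both things can be true at once: separately encoded pairs need no variant handling, while the unified characters still do.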

Side Note: Also, Chapter 18 "East Asia" of the Unicode Standard:

http://www.unicode.org/versions/Unicode13.0.0/

covers a ton of stuff (like half-width/full-width characters). I guess I have some more reading to do.

Quote:
Originally Posted by JSWolf View Post
How are you going to handle the Chinese characters in Mobi eBooks?
Wouldn't old MOBI (KF7) display Code2000?
Attached Files: Libertarian.Papers.-.Asian.Articles[2020.05.29].epub

Last edited by Tex2002ans; 05-30-2020 at 03:39 PM.
Old 05-30-2020, 04:26 PM   #15
Doitsu
Quote:
Originally Posted by Tex2002ans View Post
Side Note: Also, Chapter 18 "East Asia" of the Unicode Standard [...] covers a ton of stuff (like half-width/full-width characters).
IIRC, you'll only need to pay attention to half-width/full-width characters if you want to use a CJK font to display both CJK and Latin characters. Since this isn't the case with your book, you can forget about CJK character widths.

Quote:
Originally Posted by Tex2002ans View Post
Wouldn't old MOBI (KF7) display Code2000?
Yes, but only on the Kindle 3 (AKA Kindle Keyboard) and later.