Old 12-07-2019, 12:31 PM   #1
akatsuki
Junior Member
Posts: 1
Karma: 10
Join Date: Dec 2019
Device: Kindle Oasis
More info regarding the zh-cn / zh-tw differences for AZW3 output

Hello Calibre developer,

This topic continues an older thread: "When Calibre convert, the input language is zh-tw, but the output language become zh". I was unable to reply to that thread, so I am opening a new one. If that is not good etiquette here, please let me know.

1. I am willing to help write code and debug

I understand that Calibre is an open source project and that no developer is obliged to solve any particular problem, so I am willing to help write code and debug. However, I currently know nothing about the AZW3 internal format or the architecture of Calibre, so I am posting here to gather information and ask for help.

2. The reading difference between zh-cn and zh-tw

The main difference is that the Kindle operating system provides different fonts for them.

For zh-cn, they are: 宋体, 黑体, 楷体, 圆体.
For zh-tw, they are: 宋體, 黑體, 楷體, 圓體.
(Notice the slight difference in the names)
It seems that the Kindle does not yet support zh-{hk,mo,sg,my}. However, zh-{hk,mo} are similar to zh-tw, and zh-{sg,my} are similar to zh-cn.

3. Why does the font matter?

To save code space, the Unicode Consortium unified CJKV characters from different countries and territories into shared code points. As a result, the reader must choose the correct font; otherwise, character shapes from different countries or territories appear mixed within a paragraph. Most shared characters have similar shapes, so the reader can guess them, but roughly 1% or fewer of the characters are unintelligible because their shapes differ too much.

You can see the Unicode same-code-point, different-shape problem in this picture on Wikipedia.
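
The point can be illustrated with a short Python sketch (the sample characters are chosen here for illustration and are not from the original post). A unified character has a single code point regardless of region, so nothing in the text itself tells the renderer which regional shape to draw. Simplified and traditional forms such as 体 and 體, by contrast, are separate code points, which is why the font names listed above differ.

```python
import unicodedata

# 骨 is a commonly cited unified ideograph: one code point, but its
# conventional printed shape differs between regions.
unified = "\u9aa8"
# 体 / 體 are distinct code points (simplified vs. traditional forms).
simplified, traditional = "\u4f53", "\u9ad4"


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in (unified, simplified, traditional):
    print(ch, describe(ch))
```

Because the unified character carries no regional information, the language tag of the book is the only hint the device has when picking a font.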

4. Possible values for zh-cn and zh-tw

From previous posts, I understand that it is not clear which XML values the Kindle recognizes as zh-cn and zh-tw. I think they might be one of the following pairs:

Code:
zho-cn / zho-tw
zho-hans / zho-hant
zho-sim / zho-trad (or maybe zho-tra)
zho_CN / zho_TW
...
This narrows down the search, so the amount of work should be smaller, probably.

5. Possible fallback method?

If none of them works, it may be possible to add a lang attribute such as
Code:
<html lang="zh-tw">
to force the Kindle to use the correct font, provided the Kindle uses an HTML renderer that understands it.
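
As a rough sketch of that fallback, the attribute could be stamped onto each XHTML file of a book before conversion. The helper name `set_document_language` is hypothetical, and setting `xml:lang` alongside `lang` is an extra assumption on my part; whether the Kindle renderer honors either attribute is exactly the open question.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def set_document_language(xhtml: str, lang: str) -> str:
    """Return the XHTML source with lang and xml:lang set on the root element."""
    # Serialize XHTML elements without an "ns0:" prefix.
    ET.register_namespace("", XHTML_NS)
    root = ET.fromstring(xhtml)
    root.set("lang", lang)
    # ElementTree writes this namespaced attribute as xml:lang.
    root.set(f"{{{XML_NS}}}lang", lang)
    return ET.tostring(root, encoding="unicode")


page = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>text</p></body></html>'
print(set_document_language(page, "zh-tw"))
```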

Thank you.
Old 12-07-2019, 05:22 PM   #2
BetterRed
null operator (he/him)
Posts: 20,550
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by akatsuki
Hello Calibre developer,

This topic continues an older thread: "When Calibre convert, the input language is zh-tw, but the output language become zh". I was unable to reply to that thread, so I am opening a new one. If that is not good etiquette here, please let me know.

If you're referring to this:

[Attached image: Annotation 2019-12-08 091617.jpg]
It's a warning to deter piggyback posts on old threads, but it shouldn't prevent new posts, especially from the original poster.

Let me know if you want this thread merged with the old one.

BR
Old 12-07-2019, 07:58 PM   #3
kovidgoyal
creator of calibre
Posts: 43,840
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you wish to contribute code, feel free to do so. The azw3 output plugin is in the writer8 folder; search for lang in that folder. As far as I know, the azw3 format has no support for anything other than ISO 639-1 language codes, but if you have an azw3 file that does specify a country code, you will have to use a hex editor to check how the country code is stored in the header and then implement it in the azw3 output plugin. There is a description of the header fields of MOBI/AZW3 files in the MobileRead wiki.
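
A minimal sketch of that header check in Python, as an alternative to reading a hex dump by eye. The offsets are assumptions taken from the MOBI header description in the MobileRead wiki (the MOBI identifier at byte 16 of record 0, a 32-bit locale field at byte 92, with the low byte as the main language and the next byte as the dialect). `read_mobi_locale` is a hypothetical helper, not part of calibre, and whether a Kindle actually honors the dialect byte for Chinese is untested here.

```python
import struct
from typing import Tuple


def read_mobi_locale(data: bytes) -> Tuple[int, int]:
    """Return (main_language, dialect) from the locale field of record 0."""
    # PalmDB container: the record count is a big-endian uint16 at offset 76,
    # followed by 8-byte record entries whose first 4 bytes are the offset.
    (num_records,) = struct.unpack_from(">H", data, 76)
    if num_records == 0:
        raise ValueError("file contains no records")
    (rec0,) = struct.unpack_from(">L", data, 78)
    # Assumption (MobileRead wiki): "MOBI" sits 16 bytes into record 0.
    if data[rec0 + 16:rec0 + 20] != b"MOBI":
        raise ValueError("record 0 does not contain a MOBI header")
    # Assumption (MobileRead wiki): the locale field is at offset 92.
    (locale,) = struct.unpack_from(">L", data, rec0 + 92)
    return locale & 0xFF, (locale >> 8) & 0xFF
```

For a file whose locale field holds 0x0404 (the Windows-style code for Chinese as used in Taiwan) this returns (4, 4); 0x0804 would give (4, 8). Comparing the result for a calibre-produced file with that of a file that renders with the right fonts would show whether this field differs between them.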