Japanese characters not showing up on some devices
Hi everyone, I'm a newbie to both eBook formatting and Calibre, but willing to learn!
I have an ePub novel that was originally typeset in Reedsy. There are several instances of Japanese kanji characters in the book that show up fine in Calibre and in Kindle Previewer. However, when I send it over to my Kobo Libre, the Japanese text is dropped. (It doesn't display as a blank or as weird characters; it just vanishes entirely.) I know my Kobo can display kanji, because I just read a book that had it, and it was definitely text and not an image of the characters dropped in. Something in this book is not set up right, I'm guessing, but I'm not positive what. (I also tried generating ePubs from LibreOffice and Scrivener and they won't display the kanji either, so it's not a problem unique to the Reedsy-typeset file.)
I was trying to follow directions to embed the fonts after seeing older posts in this forum, thinking that might be a fix, but I haven't worked in depth with Calibre yet and I don't seem to be doing it right:
Spoiler:
Can someone nudge me in the right direction? Thanks! |
I think you need to embed a font that has CJK character support like Noto. The fonts shown embedded there don't cover Japanese or Chinese characters.
|
If you hope to sell this book, be aware that embedding fonts can lead to trouble on Amazon's KDP. Also note that Calibre, while a wonderful library management tool, is not the best way to format a book for sale. For that, Sigil is the go-to software.
|
Quote:
Quote:
2020: "Should Chinese Fonts be Embedded in Ebooks?"
I was working on a few journal articles that had a handful of Chinese/Japanese characters. That topic goes into a lot of technical discussion, though, so I'll try to create an easier tutorial here. :D
* * *
This is what you'll do:
1. Use Proper HTML Language in Your Document
Add this code around all your Japanese text:
Code:
<span class="japanese" lang="ja" xml:lang="ja">返</span>
2. Find Font That Includes Characters You Need
A big list of CJK fonts can be found on Wikipedia: https://en.wikipedia.org/wiki/List_of_CJK_fonts
For example, the "Source Han" or "Noto" fonts have a license that allows you to embed them in ebooks for free.
3. Download the OTF or TTF
Once you've chosen your font, download the OTF or TTF versions. (This is the format needed for EPUBs.)
For example, here is the latest page for "Source Han Sans": https://github.com/adobe-fonts/sourc...ses/tag/2.004R
(As of today, it's v2.004. Yes, fonts have version numbers + they get updated/fixed!)
Every font will be slightly different in organization/naming conventions... But in this case, you want to download:
Unzip the file, and you'll see multiple folders. What you want is buried in OTF:
- SourceHanSans
-- OTF
--- Japanese
---- SourceHanSans-Regular.otf
You can install that font to your computer and/or you can use that file to shove into your ebook.
4. Add the CSS
Now, you want to go back into your EPUB. This time, we'll adjust the CSS. In your CSS file, add this:
Code:
span.japanese {
This says to use the font ONLY for those specific pieces. :)
And in plain English, the code says: "For every <span> that has a class named 'japanese': Use the font Source Han Sans. If you can't find that, use the device's default sans-serif font."
5. Insert Font Into Your EPUB
5.1. If you're using Calibre:
or you can:
5.2. If you're using Sigil:
5.3. Or you can manually put this at the top of your CSS: Code:
@font-face {
"I have a new font here! The font's name is Source Han Sans. It is not bold and not italics. This is the location of the font file."
(Optional) 6. Subset your Fonts
If Calibre: Tools > Subset Embedded Fonts
If Sigil: Use Doitsu's "SubsetFonts" plugin.
This will cut down the filesize. Asian font files are huge, because there are thousands and thousands of characters. If you're only using a handful (like in my journal articles, there were maybe ~20 characters total), you can shave the font from many MBs down to a few hundred KBs.
In plain English: What does subsetting do exactly? Subsetting looks through your specific book + font, then deletes every letter you're not using. Imagine you only used the fancy font on your book's title page: "I CUP". You can remove all those thousands and thousands of characters, only leaving 4 within the subsetted font: I + C + U + P! :)
Quote:
Calibre's Editor does some things better than Sigil, like External Link check + Insert Special Character. And Sigil does some things better than Calibre, like easier TOC editing + easier-to-read Reports. |
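Only the opening lines of the CSS for steps 4 and 5.3 of the tutorial appear above, so here is a minimal sketch of what the plain-English descriptions imply. The "../Fonts/" path is an assumption about the EPUB's folder layout; point it at wherever SourceHanSans-Regular.otf actually sits relative to your stylesheet.

```css
/* Step 5.3: declare the embedded font (regular weight, upright style).
   The url() path is an assumed location, not taken from the tutorial. */
@font-face {
  font-family: "Source Han Sans";
  font-weight: normal;
  font-style: normal;
  src: url("../Fonts/SourceHanSans-Regular.otf");
}

/* Step 4: use that font only for spans marked as Japanese,
   falling back to the device's default sans-serif font. */
span.japanese {
  font-family: "Source Han Sans", sans-serif;
}
```

The font-family name in the span.japanese rule has to match the name declared in @font-face exactly, otherwise the reader will skip straight to the sans-serif fallback.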
Excellent tutorial.
And then even when done properly it won't work on many older ereaders, because the companies took a lazy Western-centric view. The actual OSes used had Cyrillic, Arabic, Chinese, Japanese, Thai, Hindi, Hebrew etc. over a decade before dedicated ebook readers were developed. Amazon Kindle was one of the most backward. Both Sigil and Calibre can be better for ebooks than expensive InDesign, whose ebook support is a fudge: it was developed for PDFs and for producing papers, magazines and technical books. Even for PDF novels a free word processor can be better than InDesign. |
Quote:
Quote:
Check out these two fantastic videos by Computerphile (especially the first one!):
"Internationalis(z)ing Code"
"Characters, Symbols and the Unicode Miracle"
And many other details were discussed in that "Should Chinese be Embedded" thread above. If you're interested in more, check out the Harfbuzz + LibreOffice talks I linked to in Post #8.
Harfbuzz is the text-shaping engine that figures out how to actually lay out the characters (it's now the basis of many programs/browsers). And the LibreOffice talks were some of the Asian users discussing common bugs/problems that crop up in programs like Word/LibreOffice. Plus needing to keep in mind special cases, like how they input characters using IME. |
Thank you so much! Luckily, the Kindle version displays properly and the typesetting looks great right out of the box; it's the other readers I'm trying to tackle. But I'll definitely have a closer look at Sigil once I get used to Calibre and CSS adjustments.
I really appreciate the tutorial - will get started on how to apply it tomorrow! |
1 Attachment(s)
Quote:
Kindle Previewer 3 only shows you the latest devices/formats... But you still have to keep in mind fallback code for very old Kindle devices that only read the original MOBI format (like the Kindle DX).
Sadly, Kindle Previewer 3 doesn't show these ol' devices in the dropdown anymore. You'd need to run an old version of Kindle Previewer 2 (or have a very old device on hand).
* * *
Can you show more actual examples of the Japanese usage in this book?
In the case of my journal articles, they all included full English transliterations right next to the original Chinese/Japanese words:
Code:
Liu E, also known as Liu Tieyun <span class="japanese" lang="ja" xml:lang="ja">劉鐵雲</span>, was born in 1857 at Liuhe <span class="japanese" lang="ja" xml:lang="ja">六合</span> county in what is today Nanjing <span class="japanese" lang="ja" xml:lang="ja">南京</span>.
Code:
@media amzn-mobi {
- If the format is Amazon's old MOBI format.
- And the class is "chinese" or "japanese"
- Hide it.
Instead of a reader seeing "missing boxes"... it'll just be disappeared, and all you'd see is the English text:
Quote:
Quote:
As you're following Step 1, adding the proper HTML lang markup, you can use Calibre's/Sigil's fantastic Spellcheck Lists. In Calibre: Tools > Check Spelling. In Sigil: Tools > Spellcheck > Spellcheck. You can then sort by the "Word" or "Language" column. Here's what it looks like in Sigil 1.5.1: Attachment 187303 This lets you easily spot words you haven't marked yet or accidentally marked wrongly. Like you might see a Japanese word that says Language: "English". You can then double-click on the word to jump to its location in the ebook, then add that Step 1 <span> code around it. :) Now next time you refresh the Spellcheck List, bam, it'll say Japanese! |
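The amzn-mobi fallback described in the post above can be sketched as follows. The original declaration is cut off, so display: none is an assumption about how "Hide it" was written, and the selectors assume the "chinese" and "japanese" class names used on the <span> markup from Step 1.

```css
/* Applies only when the book is rendered as Amazon's old MOBI format.
   display: none is an assumed way of hiding the text; the original
   rule body is not preserved in the thread. */
@media amzn-mobi {
  span.chinese,
  span.japanese {
    display: none;
  }
}
```

Because the English transliteration sits outside the <span>, a reader on an old device would still see the romanized name, just without the CJK characters (or "missing boxes") next to it.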
Quote:
There really isn't a choice, at this point in time. You cannot upload a book at Amazon that says "Hey, I'm KF8/KFX only, don't sell me to people with Kindle Keyboards," so...you have to take the longer view. (n.b.: well, you could upload a book that says KF8/KFX only, by using fixed-layout, but we don't do that for all the obvious reasons.) Hitch |
Looking at the OP's message, he references a Kobo Libre. The Libre has a couple of CJK fonts available, but you will need to select one of them, which will also set that font as the default for the next book. When creating an epub that will be displayed with the RMSDK renderer, you also need to avoid font-family references in the CSS, which can block the ability to select either Tsukushi Mincho or UD Kakugo as the display font for Japanese characters. Creating a kepub will use Kobo's epub3 renderer, which will search for a glyph that is not defined in the current font. At times the results of doing this can be a bit strange.
|
Quote:
Have any insights into mass generating the images? And/or resizing them correctly to the text reliably?
I think I have a few workflow ideas with ImageMagick (similar to my formulas generation), but that's a whole other thing. :D And I personally haven't messed with inline images too much since way back during that SVG/apple thread.
* * *
Side Note: For book club, I've recently been reading Jordan Peterson's latest book: "Beyond Order: 12 More Rules for Life", published by Penguin Random House.
The EPUB had two Hebrew words in there as inline images:
Quote:
Code:
img.h1em-HEB {
And I'm glad to see they used the proper alt tags! :) |
Quote:
I test any new styles on a DXG and KK3 (using Dual Mobi and also Publisher option and default on the KK3). |
Quote:
Quote:
* * * Quote:
Quote:
Hitch |
3 Attachment(s)
Quote:
Plus who knows how a device might f-up rendering RTL languages within LTR text... especially RTL within an LTR sentence! (Again, check out the fantastic Internationalization video + Harfbuzz talk on Arabic/Hebrew/East-Asian font rendering.)
Of course, I'd prefer the actual Unicode Hebrew within my EPUBs... but it was an "okay" solution. (And using proper alt means TTS + you can substitute or regenerate in the future!)
* * *
Actually, I think they may have botched the ebook slightly.
Originals: Attachment 187339 Attachment 187340
LibreOffice Writer (v7.1.3.2): Attachment 187341
1st word's alt text is definitely missing the 2nd chunk.
In 2nd example, LibreOffice has 2 dots above. In original image, it's two dots below. Unsure which rendering is correct though, since I don't read Hebrew. :D
Quote:
Closest thing I ever come across in all my books is Polytonic Greek. (Written about many times over the years.) And luckily, nearly all inline equations I come across can all be converted to simple form. So something like this: Code:
1
Code:
1 / (1 + 2 + 3 + ... + n)
Quote:
|
Quote:
Quote:
Hitch |
Quote:
Isn't that what we were talking about? :blink: You mentioned Kanji = two versions: 1 embedded fonts + 1 media-query images. I mentioned latest book I've been reading: inline Hebrew images. Quote:
And you can't just go dangling the word "equations" out there and not let me know at least a little bit of details! I must know! |
Quote:
Quote:
(This is a book on statistics, and the bigger issue is what I refer to as b-hats, or hatted-bs. Hatted-S'es have a Unicode character, but hatted-Bs do not, so we'll have to do all the hatted-bs, and all the fraction equations, as images. [sigh]) And you know, inline SVG in "Kindle" (aka MOBI or ePUB or whatever damned format) doesn't work right. Bugger. Fun working with YOU, though, snookums. Hitch |
1 Attachment(s)
Quote:
B̂ It looks like crap on this site in my browser with whatever sans-serif font it's using, but pretty decent on a separate HTML file with the default (Deja Vu Serif) font: |
Quote:
Hitch |
Quote:
The customer doesn't want that, and neither do I. Even Cambria's math glyphs don't have the bloody thing. There is no b-hat unicode and believe me, we looked. Hell, there are Stack Exchange whinges about it! Hitch |
Quote:
https://en.wikipedia.org/wiki/Teachi...r_to_suck_eggs Pizza pre-dates tomato. It's a corruption of pita, which originally was just flat bread, not "pocket bread". |
Quote:
And as for Amazon, they really Fed up with embedded fonts. They should automatically select the publisher font option. |
Quote:
No, you can't use inline SVGs, at least, not as of my last test. Every time you put in a bloody SVG (which is, mind you, allegedly "supported" for KF8/KFX!), you get an inadvertent page-break (screen break). Works fine for full-page/screen images, natch--but it's utterly worthless for inline. Trust me, Jon--I swear to you, this is something we've searched high and low on. Unless KDP has changed something around SVG in the last...say, 4 months, the ONLY solution so far is (regular, JPEG/GIF/PNG) images and yes, I hate it. Hitch |
Quote:
Sorry Amazon makes these sorts of things a pain in the ass. |
MobileRead.com is a privately owned, operated and funded community.