05-09-2024, 01:19 PM | #1 |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
Special characters and conversion
Dear friends,
I have a problem with converting a book (from docx to epub). The book in question contains some words in the ancient Indian language Pali, which uses a few special characters that are difficult to get displayed properly. Some examples of the characters in question are these: ṅ ṭ ṃ ṁ ṇ When I view the epub on my e-reader (Kobo Clara 2E), each of these characters is simply ignored and the words are "merged together". For example: the word tomato would turn into toato. Strangely enough, when I open the file in the Calibre ebook reader, it works fine. It is only on the Kobo that it malfunctions. I have some other books on the Kobo where these characters display just fine, so I know that it is not beyond the capabilities of the device itself. I would be very grateful for any advice on this! Last edited by MGA; 05-09-2024 at 01:22 PM. |
05-09-2024, 01:31 PM | #2 | |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
In relation to this statement from the original post:
Quote:
|
|
|
05-09-2024, 02:06 PM | #3 |
Bibliophagist
Posts: 40,516
Karma: 156983616
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Sounds as if you are trying to display glyphs that are not supported by the current font used on your Kobo ereader.
|
05-09-2024, 02:12 PM | #4 |
creator of calibre
Posts: 44,539
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You need to embed a font in the book that has those characters. There are three ways to do that: embed it in the Word document itself, and conversion should preserve it; embed it after conversion using the editor and apply the right font styles; or use the conversion option to embed all referenced fonts. The last option works only if you have the font on your system somewhere and the docx file actually references the correct font by name but just doesn't embed it.
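For the third route, here is a minimal sketch of driving the conversion from a script, assuming calibre's `ebook-convert` command-line tool is installed and on the PATH. The file names `book.docx` and `book.epub` and the helper names are placeholders for illustration; `--embed-all-fonts` is the command-line form of the "embed all referenced fonts" conversion option.

```python
import shutil
import subprocess


def build_convert_command(source, target, embed_all_fonts=True):
    """Build the argument list for calibre's ebook-convert tool."""
    cmd = ["ebook-convert", source, target]
    if embed_all_fonts:
        # Ask calibre to embed fonts the input references but does not embed.
        cmd.append("--embed-all-fonts")
    return cmd


def convert(source, target):
    """Run the conversion; fails loudly if calibre's CLI is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH")
    subprocess.run(build_convert_command(source, target), check=True)


# Hypothetical file names, shown only to illustrate the resulting command.
print(" ".join(build_convert_command("book.docx", "book.epub")))
```

This only helps when the referenced font is actually installed on the computer running the conversion, as described above.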
|
05-09-2024, 03:15 PM | #5 |
Well trained by Cats
Posts: 30,443
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Those seem to be language-specific.
Conversions from older TXT files sometimes required you to set the CHARSET when adding the book. What language was the DOCX set to? I had guessed ligatures (ffi, ll...), but I don't know of one that includes an M, so that rules out ligatures as the reason for the 'missing' characters. |
|
05-10-2024, 04:09 AM | #6 |
Enthusiast
Posts: 25
Karma: 574140
Join Date: May 2024
Location: Berlin
Device: KindlePW Scribe Palma Poke5 NovaAir2 NA3C TUC Max2 TabX A6X2
|
This is a two-step problem: First you need to make sure that the correct Unicode character is used in the source/text itself (so not DIY combination of letter-x+point-below), and then you need to make sure that the system/platform/device that displays this text and its diacritical special glyphs also has a font that contains this Unicode glyph.
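The first step can be checked with a few lines of Python. This is only a sketch using the standard `unicodedata` module; the helper names are made up for illustration. It shows the difference between a single precomposed character and a base letter followed by a combining mark, and how NFC normalization merges the latter into the former where a precomposed form exists. Whether a given e-reader renders combining sequences correctly is not something this check can tell you.

```python
import unicodedata


def describe(text):
    """List (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


def has_combining_marks(text):
    """True if the text contains separate combining marks (a 'DIY' combination)."""
    return any(unicodedata.combining(ch) for ch in text)


precomposed = "\u1E6D"   # single code point: t with dot below
decomposed = "t\u0323"   # plain t followed by COMBINING DOT BELOW

print(describe(precomposed))
print(describe(decomposed))
print(has_combining_marks(precomposed), has_combining_marks(decomposed))

# NFC normalization replaces the two-code-point form with the precomposed one.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Both strings look the same on screen when the font supports them, which is why inspecting the code points is more reliable than looking at the text.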
|
05-10-2024, 10:41 AM | #7 | |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
Quote:
And as for the second side of the problem: What would then happen if I specifically downloaded a font and added it to my own device, and then shared the file in question with someone else? Would they end up with the same error, or is it somehow possible to "attach" a font to the file? Sorry if my questions are on a very elementary level. |
|
05-10-2024, 10:53 AM | #8 | |
Well trained by Cats
Posts: 30,443
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Depending on the font face, get out your wallet if you want to distribute the book. Font licenses can get expensive. Most fonts are copyrighted, and it gets worse: a license might be needed for up to four of them (normal, bold, italic, bold-italic). |
|
05-10-2024, 11:00 AM | #9 | |
Enthusiast
Posts: 25
Karma: 574140
Join Date: May 2024
Location: Berlin
Device: KindlePW Scribe Palma Poke5 NovaAir2 NA3C TUC Max2 TabX A6X2
|
Quote:
Regarding the licence: Yes, licensing is another thing to consider. Thus the safest way would be not to use a fancy font with a licence, but to take one of the few well-known fonts that cover basically everything in the Unicode chart. That leaves you with (admittedly not so pretty) fonts like Times New Roman, Arial, Noto, etc. |
|
05-10-2024, 11:21 AM | #10 | |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
Quote:
Thanks for taking the time and making the effort! I am quite a novice in this, but I managed to access the "document.xml", and there I found this information: encoding="UTF-8" It seems to me that this might be very relevant...? |
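The `encoding="UTF-8"` declaration only says how the XML file is stored; it does not say which code points the text uses or which font will draw them. A rough sketch for listing the special characters actually present, assuming the usual docx layout where the main text lives in `word/document.xml` (the function names and the command-line usage are illustrative only):

```python
import re
import sys
import unicodedata
import zipfile
from collections import Counter


def non_ascii_in_docx(path):
    """Count the non-ASCII code points in the main text part of a .docx file."""
    with zipfile.ZipFile(path) as zf:
        xml = zf.read("word/document.xml").decode("utf-8")
    # Crude tag stripping; good enough for a survey of the characters used.
    text = re.sub(r"<[^>]+>", "", xml)
    return Counter(ch for ch in text if ord(ch) > 127)


def report(path):
    """Print each special character with its code point, name, and count."""
    for ch, count in non_ascii_in_docx(path).most_common():
        name = unicodedata.name(ch, "UNKNOWN")
        print(f"U+{ord(ch):04X}  {name}  x{count}")


if __name__ == "__main__" and len(sys.argv) > 1:
    report(sys.argv[1])  # e.g. python check_docx.py mybook.docx
```

Entries named COMBINING ... in the report would point to the "DIY combination" case mentioned earlier in the thread; entries such as LATIN SMALL LETTER M WITH DOT BELOW are the single precomposed characters.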
|
05-10-2024, 11:28 AM | #11 |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
|
05-10-2024, 11:34 AM | #12 | |
Enthusiast
Posts: 26
Karma: 10
Join Date: May 2024
Device: Kobo, Clara
|
Excellent. I hope this is not too complicated, but I will examine it. And from what I read previously, would it perhaps be best to do this in the DOCX, before converting it to EPUB?
Quote:
|
|
05-10-2024, 12:54 PM | #13 |
the rook, bossing Never.
Posts: 12,352
Karma: 92073397
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
No need to embed in the docx, which makes opening times terrible.
As long as the same fonts are on the PC running Calibre (always true if the same computer), it will find and embed the fonts if asked. Obviously if for publication you need either free fonts or a licence that covers ebook distribution (which might be incompatible with Kindle Publishing if the fonts have to be obfuscated or encrypted). It doesn't matter for personal use. |
05-10-2024, 03:53 PM | #14 |
Bibliophagist
Posts: 40,516
Karma: 156983616
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
It would be easier to embed the font in your ePub after the conversion. You can use the calibre ebook-editor to do this. As @Quoth mentioned, if this is not for your personal use, you would have to check the licensing on the font.
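After embedding, it can be worth confirming that the font really ended up inside the ePub. The following is a minimal sketch, relying only on the fact that an ePub is a zip archive; the function names and the list of font suffixes are assumptions made for illustration, not part of calibre.

```python
import zipfile

FONT_SUFFIXES = (".ttf", ".otf", ".woff", ".woff2")


def embedded_fonts(epub_path):
    """Return the names of font files stored inside the ePub."""
    with zipfile.ZipFile(epub_path) as zf:
        return [n for n in zf.namelist() if n.lower().endswith(FONT_SUFFIXES)]


def stylesheets_with_font_face(epub_path):
    """Return the CSS files in the ePub that contain an @font-face rule."""
    hits = []
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".css"):
                css = zf.read(name).decode("utf-8", errors="replace")
                if "@font-face" in css:
                    hits.append(name)
    return hits
```

An empty result from either function suggests the font was not embedded or is not declared. A font file that is present but never applied through a `font-family` style will still not be used for the text, so the styles need checking as well.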
|
05-10-2024, 05:44 PM | #15 |
Well trained by Cats
Posts: 30,443
Karma: 58055868
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
calibre's Polish books tool can do it as well.
|
|