#16 |
Still reading
Posts: 14,167
Karma: 105212035
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
The old mobi/azw/KF7 format only supports a subset of Greek in addition to the Latin alphabet. Lack of fonts on the ereaders is one issue.
No Chinese, Korean, Hindi, Japanese (or any other Asian script). No Cyrillic, Hebrew or Arabic. The installed fonts do support the extra letters used in Icelandic, Spanish, French, German, etc. that are not used in English. It was really very bad of Amazon to release such limited GUI/language support in 2007. The underlying OS had supported this over 10 years earlier. They seem to have just bolted the Mobipocket stuff they bought in 2005 on top of Linux. It was very limited because it had originally supported Palm OS, Symbian, Windows CE and maybe DOS. |
#17 |
Addict
Posts: 240
Karma: 3500000
Join Date: Sep 2009
Device: Sony PRS-300, PRS-T1, PRS-T3
|
Quote:
Basically: MOBI doesn't seem to attempt any word segmentation (you can select any combination of characters). AZW/KFX only let you select along some notion of word boundaries (kanji + inflection). I'll have to do some more experimentation to see whether that is derived from a source feature (like a RUBY tag) or from some other process. Are you aware of any documentation (or reverse engineering) that describes the word boundary information in KF8 or KFX? |
|
#18 |
Grand Sorcerer
Posts: 7,087
Karma: 91577715
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
I have not seen any documentation on this. I looked into the KF8 GESW years ago but did not write up my findings. If I recall correctly, it is a compressed table of coded instructions for parsing the raw HTML content to determine which bytes make up each word, taking into account that they might not be contiguous due to HTML markup.
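
To illustrate only the "bytes might not be contiguous" part, here is a rough Python sketch. It is not the KF8 table format, which is undocumented as far as this thread knows. The function names `word_byte_ranges` and `word_bytes` are made up for the example, and it assumes words are separated by ASCII whitespace and that every tag is inline markup. Japanese has no spaces between words, so a real segmenter would need a dictionary or similar; the point here is just how one word can map to several byte ranges.

```python
def word_byte_ranges(html: bytes):
    """Return one list of (start, end) byte ranges per word, skipping markup.

    Assumptions (illustrative only, not the KF8 format): words are split on
    ASCII whitespace, and tags never end a word.
    """
    words = []        # finished words, each a list of (start, end) ranges
    current = []      # ranges collected for the word being built
    run_start = None  # offset where the current run of text bytes began
    in_tag = False

    def close_run(end):
        nonlocal run_start
        if run_start is not None:
            current.append((run_start, end))
            run_start = None

    def close_word():
        nonlocal current
        if current:
            words.append(current)
            current = []

    for i, byte in enumerate(html):
        ch = bytes([byte])
        if in_tag:
            if ch == b">":
                in_tag = False
        elif ch == b"<":
            close_run(i)      # markup interrupts the run but not the word
            in_tag = True
        elif ch.isspace():
            close_run(i)
            close_word()      # whitespace ends the word
        elif run_start is None:
            run_start = i
    close_run(len(html))
    close_word()
    return words


def word_bytes(html: bytes, ranges):
    """Reassemble a word from its (possibly non-contiguous) byte ranges."""
    return b"".join(html[start:end] for start, end in ranges)


sample = b"<p>Hel<b>lo</b> world</p>"
ranges = word_byte_ranges(sample)
```

For `sample`, the first word comes back as two ranges, `(3, 6)` and `(9, 11)`, because the `<b>` tag sits in the middle of "Hello". Because tags never end a word in this sketch, adjacent block elements with no whitespace between them would be merged into one word, which a real implementation would have to handle.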
|