#1 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Antique languages in epub
Hello!
I hope somebody can help with this: is there some recommendation/best practice for dealing with words in an antique language in an epub? And by 'antique language' I don't mean Latin or Classical Greek, but beauties like Hittite or Phrygian--so I suspect that both spelling and pronunciation are dubious, and for specialists... However, I feel that they ought to be got out of the way of the spell-checker, and if one were to use a text-to-speech... Any help will be appreciated. |
#2 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Use the appropriate lang code; spell checking and text-to-speech should obey it, and that could mean ignoring the word.
|
#3 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Please, are there language codes for Old Indic, Avestan, Old Church Slavonic, Old Norse, Old English, Old High German, Phrygian, Hittite, Luwian...? Where does one find them?
I was rather hoping for some comprehensive label on the lines of xml:lang="exclude" / "none" / "PIE", that would deal with all those bits--some are only roots: 'xxx-' |
#4 |
Grand Sorcerer
Posts: 7,032
Karma: 91577715
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
|
I am not familiar with those languages but it occurs to me that some may be based on obsolete alphabets. If so there may not be existing fonts to support them or even Unicode code points defined for some of the needed characters.
|
#5 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Well, neither am I familiar, and in fact I am cleaning the book (from OCR) as I read. There were some glyphs rendered as GIF images (just like I used to render equations before MathML) but fortunately, I was able to find all of those characters in Unicode--chiefly in the Latin blocks, and some Greek. I suspect that in such cases (most of the languages referred to in the book are preliterate) linguists try to give a 'transliteration', or a 'reconstructed pronunciation'. Most are no worse than *k’ṃtom (the m has a dot below) but I feel that there should be some epub mark-up to distinguish such from 'normal language'... So, I am asking--hope that somebody knows?
|
#6 |
Bibliophagist
Posts: 44,814
Karma: 168802811
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Are we looking at Hittite, Middle Hittite, Neo-Hittite or Old Hittite? The ISO 639-3 language code list has all four of those. Phrygian only has the one entry.
See ISO 639 Code Tables for the complete searchable list. Wrapping a string in a span with the appropriate language code should prevent a spellchecker from attempting to spellcheck it unless it has a matching dictionary. Something like <span xml:lang="xpg">*k’ṃtom</span> for example. |
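If there are many such fragments, the wrapping could be scripted. Here is a minimal Python sketch, assuming you already know which code belongs to which fragment; `wrap_lang` is a made-up helper name, not part of any tool mentioned in this thread. It writes both `xml:lang` and `lang` with the same value, which is a common habit for EPUB 3 XHTML rather than something this thread requires.

```python
from html import escape


def wrap_lang(text: str, code: str) -> str:
    """Wrap a fragment in a span carrying the given language code.

    Hypothetical helper: the code is written to both xml:lang and lang,
    and the text is escaped so that &, < and > stay valid XHTML.
    """
    safe_code = escape(code, quote=True)
    safe_text = escape(text, quote=False)
    return f'<span xml:lang="{safe_code}" lang="{safe_code}">{safe_text}</span>'


if __name__ == "__main__":
    # The Phrygian code "xpg" is the one suggested in the post above.
    print(wrap_lang("*k’ṃtom", "xpg"))
```

The helper only builds the markup; deciding which words to wrap is still a manual job.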
#7 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
My, oh, my! Ask to learn: I never suspected that there would be so many codes for dead languages!
Now, a short and perhaps silly question: would it work just to mark those words (or roots, mostly) as <span xml:lang="pied">*k’ṃtom</span>, i.e., with a non-existent language code, or would it set off all the alarm bells in EpubCheck? Really, the author is hardly ever giving words in a particular language, but rather pointing out 'cognates' or common roots in related languages... It seems rather overkill to insert more than thirty different <span xml:lang="---">...</span> for mere fragments, mostly not belonging to actual languages, when what is really required is the notice 'don't treat this as a common word'. Anyway, my thanks for the link to the code tables: good to know that there is such a complete reference for out-of-the-way languages. |
#8 |
Bibliophagist
Posts: 44,814
Karma: 168802811
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
I've never tried a non-existent language code but give it a try and see what happens.
Hey, if it's got room for Klingon, antique languages are nothing. |
#9 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
There are some special codes for cases where no other existing code is appropriate. Of course, what a particular application will do or not do with such codes is... unknown (I think calibre does not accept 'zxx' as a book language, for example).
|
#10 | |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
https://www.loc.gov/marc/languages/language_name.html
Old Indic: USE Vedic: Assigned collective code [san]
Avestan: [ave]
Slavonic, Old Church: USE Church Slavic: [chu]
Old Norse: [non]
Old English: USE English, Old (ca. 450-1100): [ang]
Old High German: USE German, Old High (ca. 750-1050): [goh]
Phrygian: Assigned collective code [ine]
Hittite: [hit]
Luwian: Assigned collective code [ine]
I had some "fun" translating all language names in calibre... |
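For anyone scripting the clean-up, the list above fits in a small lookup table. A Python sketch, with the codes copied from the quoted list; the names `MARC_CODES` and `code_for` are invented for illustration, and falling back to "und" for anything not in the table is my assumption, not something the list prescribes.

```python
# Codes copied from the quoted Library of Congress list above.
# "ine" is a collective code, so it is less specific than an
# individual code such as the "xpg" suggested earlier for Phrygian.
MARC_CODES = {
    "Old Indic": "san",
    "Avestan": "ave",
    "Old Church Slavonic": "chu",
    "Old Norse": "non",
    "Old English": "ang",
    "Old High German": "goh",
    "Phrygian": "ine",
    "Hittite": "hit",
    "Luwian": "ine",
}


def code_for(language: str, fallback: str = "und") -> str:
    """Return the code for a language name, or the fallback if unlisted."""
    return MARC_CODES.get(language, fallback)


if __name__ == "__main__":
    for name in ("Old Norse", "Hittite", "Tocharian"):
        print(name, "->", code_for(name))
```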
|
#11 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Thanks again! I have tried xml:lang="none" and EpubCheck has passed it. Moreover, searching the ISO database for "none" found no match--I also tried "gib" for "gibberish", but there is a real language with that code...
Aside from this particular non-fiction book, there are tons of my favourite SF & Fantasy works chock-full of words in the author's tooled lingo. Klingon is nothing to it--think of Edgar Rice Burroughs! I suspect the Tarzan, Barsoom, Venus, ... series haven't got their own ISO entries. |
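A possible explanation for "none" passing, offered as a guess and not as a statement about EpubCheck's internals: a validator may only check the shape of a tag and never look it up in a registry. The Python sketch below uses the pattern of the XML Schema `xs:language` datatype for the shape test. `CODES_FROM_THREAD` is a hand-made sample of codes named in this thread, standing in for the real registry, and both function names are made up.

```python
import re

# Shape of the XML Schema xs:language datatype: 1-8 letters, then
# optional hyphen-separated subtags of 1-8 letters or digits.
LANGUAGE_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Sample of codes mentioned in this thread; a real check would load
# the full IANA language subtag registry instead.
CODES_FROM_THREAD = {
    "xpg", "hit", "ave", "chu", "non", "ang", "goh",
    "san", "ine", "tlh", "gib", "und", "zxx",
}


def is_well_formed(tag: str) -> bool:
    """True if the tag has an acceptable shape, whether or not it exists."""
    return LANGUAGE_SHAPE.fullmatch(tag) is not None


def is_known(tag: str) -> bool:
    """True if the tag is in the small sample set above."""
    return tag.lower() in CODES_FROM_THREAD


if __name__ == "__main__":
    for tag in ("xpg", "none", "pied", "---"):
        print(tag, is_well_formed(tag), is_known(tag))
```

Under this reading, "none" and "pied" have a valid shape but name no registered language, while "---" fails even the shape test. RFC 5646 reserves four-letter primary subtags for future use, so a made-up four-letter tag is not guaranteed to stay meaningless.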
#12 |
curly᷂͓̫̙᷊̥̮̾ͯͤͭͬͦͨ ʎʌɹnɔ
Posts: 3,016
Karma: 50506927
Join Date: Dec 2010
Location: ♁ ᴺ₄₅°₃₀' ᵂ₇₃°₃₇' ±₆₀"
Device: K3₃.₄.₃ PW3&4₅.₁₃.₃
|
The IANA language registry and RFC 5646: Tags for Identifying Languages might also be useful resources.
I used these extensively doing this. A cursory inspection found Gibanawa (gib), Klingon (tlh), and all the Hittite variants from the posts above. |
#13 | |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
You could use "und" or "zxx". |
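For reference, the two codes suggested here belong to a small family of special-purpose ISO 639 codes, and the range qaa to qtz is reserved for local (private) use. A Python sketch that tells them apart; the names `SPECIAL_CODES`, `is_private_use` and `describe` are invented for illustration, and what any reader, spell checker or TTS engine actually does with these codes is not covered.

```python
# Special-purpose ISO 639 codes and their meanings.
SPECIAL_CODES = {
    "und": "undetermined language",
    "zxx": "no linguistic content",
    "mis": "uncoded language",
    "mul": "multiple languages",
}


def is_private_use(code: str) -> bool:
    """True for three-letter codes in the reserved local-use range qaa..qtz."""
    code = code.lower()
    return (
        len(code) == 3
        and code.isascii()
        and code.isalpha()
        and "qaa" <= code <= "qtz"
    )


def describe(code: str) -> str:
    """Give a short description of a special or private-use code."""
    code = code.lower()
    if code in SPECIAL_CODES:
        return SPECIAL_CODES[code]
    if is_private_use(code):
        return "private use"
    return "ordinary or unknown code"


if __name__ == "__main__":
    for code in ("und", "zxx", "qpi", "xpg"):
        print(code, "->", describe(code))
```

Roughly, "und" says "this is language, but which one is not stated", while "zxx" says "this is not language at all", so "und" looks like the closer fit for reconstructed roots, though that is a judgment call.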
|
#14 |
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
|
Many thanks, everybody! Besides learning a lot about language codes (until now, I had only bothered about the ones that I can read, more or less), I am now definitely clear about how to deal with 'not really a language'.
|
#15 | ||
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Because from the link you posted: Quote:
Last edited by Tex2002ans; 01-07-2021 at 08:19 AM. |
Thread | Thread Starter | Forum | Replies | Last Post |
How to correctly display characters in ePub for non-standard languages? | Nitnesh | ePub | 3 | 05-27-2017 11:13 AM |
Can an EPUB file be localized to many languages? | hershe | ePub | 2 | 02-20-2013 06:15 AM |
Can I write an epub ebook in two languages? | dvitoria | Writers' Corner | 7 | 09-11-2011 08:55 AM |
Can I write an epub ebook in two languages? | dvitoria | ePub | 2 | 09-09-2011 12:25 PM |
Antique books, copyright status. | Ankh | Workshop | 10 | 05-28-2009 09:52 AM |