MobileRead Forums > E-Book Formats > ePub

12-30-2020, 03:24 PM   #1
carmenchu
Groupie
Posts: 183
Karma: 266070
Join Date: Dec 2010
Device: Win7,Win10,Lubuntu,smartphone
Antique languages in epub

Hello!
I hope somebody can help with this: is there a recommendation or best practice for dealing with words in an antique language in an EPUB?
And by 'antique language' I don't mean Latin or Classical Greek, but beauties like Hittite or Phrygian, so I suspect that both spelling and pronunciation are dubious, and a matter for specialists...
However, I feel that such words ought to be kept out of the way of the spell-checker, and of text-to-speech, if one were to use it.
Any help will be appreciated.

12-30-2020, 03:51 PM   #2
Jellby
frumious Bandersnatch
Posts: 7,514
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Use the appropriate lang code; spell checking and text-to-speech should obey it, and that could mean ignoring the word.

12-30-2020, 04:07 PM   #3
carmenchu
Please, are there language codes for Old Indic, Avestan, Old Church Slavonic, Old Norse, Old English, Old High German, Phrygian, Hittite, Luwian...? Where does one find them?
I was rather hoping for some comprehensive label along the lines of xml:lang="exclude" / "none" / "PIE" that would deal with all those bits; some are only roots: 'xxx-'

12-30-2020, 04:44 PM   #4
jhowell
Grand Sorcerer
Posts: 6,470
Karma: 84000001
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
I am not familiar with those languages but it occurs to me that some may be based on obsolete alphabets. If so there may not be existing fonts to support them or even Unicode code points defined for some of the needed characters.

12-30-2020, 05:15 PM   #5
carmenchu
Quote:
Originally Posted by jhowell View Post
I am not familiar with those languages but it occurs to me that some may be based on obsolete alphabets. If so there may not be existing fonts to support them or even Unicode code points defined for some of the needed characters.
Well, neither am I, and in fact I am cleaning the book (from OCR) as I read. Some glyphs were rendered as GIF images (just as I used to render equations before MathML), but fortunately I was able to find all of those characters in Unicode, chiefly in the Latin blocks, plus some Greek. I suspect that in such cases (most of the languages referred to in the book are preliterate) linguists try to give a 'transliteration' or a 'reconstructed pronunciation'. Most are no worse than *k’ṃtom (the m has a dot below), but I feel that there should be some EPUB markup to distinguish such forms from 'normal language'... So I am asking, in the hope that somebody knows.

12-30-2020, 05:31 PM   #6
DNSB
Bibliophagist
Posts: 34,517
Karma: 144552660
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Are we looking at Hittite, Middle Hittite, Neo-Hittite or Old Hittite? The ISO 639-3 language code list has all four of those. Phrygian only has the one entry.

See ISO 639 Code Tables for the complete searchable list.

Wrapping a string in a span with the matching language code should prevent a spellchecker from attempting to check it unless it has a matching dictionary. Something like <span xml:lang="xpg">*k’ṃtom</span>, for example.

12-30-2020, 06:10 PM   #7
carmenchu
My, oh, my! Ask to learn: I never suspected that there would be so many codes for dead languages!
Now, a short and perhaps silly question: would it work just to mark those words (or roots, mostly) as <span xml:lang="pied">*k’ṃtom</span>, i.e., with a non-existent language code, or would it set off all the alarm bells in EPUBCheck? Really, the author is hardly ever giving words in a particular language, but rather pointing out 'cognates' or common roots in related languages... It seems rather overkill to insert more than thirty different <span xml:lang="---">...</span> for mere fragments, mostly not belonging to actual languages, when what is really required is the notice 'don't treat this as a common word'.

Anyway, my thanks for the link to the code tables: good to know that there is such a complete reference for out-of-the-way languages.

12-30-2020, 07:21 PM   #8
DNSB
I've never tried a non-existent language code but give it a try and see what happens.

Hey, if it's got room for Klingon, antique languages are nothing.

12-31-2020, 04:30 AM   #9
Jellby
There are some special codes for cases where no other existing code is appropriate. Of course, what a particular application will do or not do with such codes is... unknown (I think calibre does not accept 'zxx' as a book language, for example).

12-31-2020, 04:40 AM   #10
Jellby
Quote:
Originally Posted by carmenchu View Post
Please, are there language codes for Old Indic, Avestan, Old Church Slavonic, Old Norse, Old English, Old High German, Phrygian, Hittite, Luwian...?
You could refer to:
https://www.loc.gov/marc/languages/language_name.html

Old Indic: USE Vedic: Assigned collective code [san]
Avestan: [ave]
Slavonic, Old Church: USE Church Slavic: [chu]
Old Norse: [non]
Old English: USE English, Old (ca. 450-1100): [ang]
Old High German: USE German, Old High (ca. 750-1050): [goh]
Phrygian: Assigned collective code [ine]
Hittite: [hit]
Luwian: Assigned collective code [ine]
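
For marking up a whole book, the codes in this list could be kept in a small lookup table. What follows is only a minimal Python sketch: `LANG_TAGS` and `tag_word` are hypothetical names, not part of calibre, Sigil or EPUBCheck. It assumes the values end up in xml:lang, which takes BCP 47 tags, and those use the two-letter ISO 639-1 code where one exists, so MARC [ave] and [chu] are written ae and cu here. Phrygian uses the ISO 639-3 code xpg given in post #6 rather than the collective [ine]; Old Indic and Luwian are left out because the list only offers collective codes for them.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lookup table: language name -> tag to put in xml:lang.
LANG_TAGS = {
    "Avestan": "ae",          # MARC [ave]; BCP 47 uses the 2-letter code
    "Church Slavic": "cu",    # MARC [chu]; same reason
    "Old Norse": "non",
    "Old English": "ang",
    "Old High German": "goh",
    "Hittite": "hit",
    "Phrygian": "xpg",        # ISO 639-3 code mentioned in post #6
}


def tag_word(word: str, language: str) -> str:
    """Wrap one word or root in a span carrying its language tag."""
    code = LANG_TAGS[language]  # raises KeyError for a language not listed
    # escape() protects &, < and > so the result stays well-formed XHTML.
    return f"<span xml:lang={quoteattr(code)}>{escape(word)}</span>"


print(tag_word("xxx-", "Hittite"))  # <span xml:lang="hit">xxx-</span>
```

A language missing from the table raises a KeyError on purpose, so that an unlisted name is noticed rather than silently tagged with something made up.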

I had some "fun" translating all language names in calibre...

12-31-2020, 04:57 AM   #11
carmenchu
Thanks again! I have tried xml:lang="none" and EPUBCheck has passed it. Moreover, searching the ISO database for "none" found no match. I also tried "gib" for "gibberish", but there is a real language with that code...
Aside from this particular non-fiction book, there are tons of my favourite SF & Fantasy works chock-full of words in the author's own invented lingo. Klingon is nothing to it: think of Edgar Rice Burroughs! I suspect the Tarzan, Barsoom, Venus, ... series haven't got their own ISO entries.

12-31-2020, 08:11 AM   #12
PoP
 curly᷂͓̫̙᷊̥̮̾ͯͤͭͬͦͨ ʎʌɹnɔ
Posts: 3,002
Karma: 50506927
Join Date: Dec 2010
Location: ♁ ᴺ₄₅°₃₀' ᵂ₇₃°₃₇' ±₆₀"
Device: K3₃.₄.₃ PW3&4₅.₁₃.₃
The IANA language registry and RFC5646: Tags for Identifying Languages might also be useful resources.

I used these extensively doing this.

A cursory inspection found Gibanawa, tlh, and all the Hittite variations from the posts above, yet not Phyrgian.

12-31-2020, 10:52 AM   #13
Doitsu
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by carmenchu View Post
Thanks again! I have tried xml:lang="none" and EpubCheck has passed it.
AFAIK, EPUBCheck doesn't check the values of xml:lang attributes. For example, xml:lang="nope" isn't flagged either.
You could use "und" or "zxx".
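
Since the values are not validated, a typo in a tag would go unnoticed. One way to catch it is to list every xml:lang value in a file and compare it against the tags the book is meant to use. This is a rough Python sketch under assumptions: `unexpected_lang_tags`, `EXPECTED_TAGS` and `SAMPLE` are hypothetical names, the allowlist is just an example, and the input must be well-formed XML that ElementTree can parse (a real XHTML file with a DOCTYPE and named entities such as &nbsp; may need extra handling).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical allowlist: the tags this particular book is expected to use.
EXPECTED_TAGS = {"en", "und", "zxx", "hit", "xpg"}


def unexpected_lang_tags(xhtml: str, expected=EXPECTED_TAGS) -> set:
    """Return every xml:lang value in the document that is not in `expected`."""
    root = ET.fromstring(xhtml)
    found = {
        el.get(XML_LANG) for el in root.iter() if el.get(XML_LANG) is not None
    }
    return found - set(expected)


SAMPLE = (
    '<p xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">Compare '
    '<span xml:lang="und">*k’ṃtom</span> with '
    '<span xml:lang="nope">xxx-</span>.</p>'
)
print(unexpected_lang_tags(SAMPLE))  # {'nope'}
```

The check only compares against a hand-made list; it does not consult the IANA registry, so the allowlist has to be kept in step with the book.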

01-01-2021, 10:28 AM   #14
carmenchu
Many thanks, everybody! Besides learning a lot about language codes (until now, I had only bothered with the ones that I can read, more or less), I am now definitely clear about how to deal with 'not really a language'.

01-07-2021, 08:16 AM   #15
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Doitsu View Post
You could use "und" or "zxx".
"und" would be correct in this case.

Because from the link you posted:

Quote:
When the text is non-linguistic

Use the subtag zxx when the text is known to be not in any language.

This would apply for text such as type samples, part numbers, illustrations of binary data, etc. The definition of zxx in the IANA Language Subtag Registry is 'no linguistic content'.

For example:

Code:
<p>Here is a list of part numbers: <span lang="zxx">9RUI34 8XOS12 3TYY85</span>.</p>
So zxx is for gibberish, and und is for linguistic content in an unknown language.
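
For a book with dozens of reconstructed roots, the und spans could be added in bulk rather than by hand. A minimal Python sketch of that idea, assuming the fragments are known in advance and the text passed in is plain text with no markup yet; `mark_fragments` is a hypothetical helper, not a feature of any EPUB editor.

```python
import re


def mark_fragments(text: str, fragments, tag: str = "und") -> str:
    """Wrap each listed fragment found in `text` in a span with xml:lang=tag."""
    if not fragments:
        return text
    # Try longer fragments first so a short one cannot split a longer one.
    ordered = sorted(set(fragments), key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(f) for f in ordered))
    return pattern.sub(
        lambda m: f'<span xml:lang="{tag}">{m.group(0)}</span>', text
    )


line = "The root *k’ṃtom is the usual example."
print(mark_fragments(line, ["*k’ṃtom"]))
# The root <span xml:lang="und">*k’ṃtom</span> is the usual example.
```

The match is a plain substring search, so a short fragment would also be wrapped where it happens to occur inside an ordinary word; fragments that begin with an asterisk or end with a hyphen are much less likely to collide.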

Last edited by Tex2002ans; 01-07-2021 at 08:19 AM.

All times are GMT -4.