08-02-2020, 07:53 PM   #32
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by 1v4n0
I often edit ebooks which feature many foreign words, whose (the words') language is not marked in the code [...] As things stand now, when correcting a long book (typically university textbooks on humanistic subjects), I find myself scrolling through a list of thousands of words, many of which are in some language other than the one the text is actually written in, and 99% of which are false positives.
Back in 2019, I wrote a rough breakdown of the method I currently use:

Post #11 in "Export list of words in spellcheck"

which also points to how I use (Calibre's) Spellcheck Lists + Regex:

Post #29 in "Is there a way to use the selection in a Saved Search?"

I've used that method successfully on journal articles + text from game files (millions of words).

For one game, I even hackishly assigned each character a different lang, then used Calibre to give me a breakdown of all words spoken per character. This allowed me to normalize the translation. (For example, one character always said "dinnae" instead of "didn't". The word list method made sure to catch any strays.)
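To make the per-character breakdown above a bit more concrete, here's a rough Python sketch of the same idea done outside Calibre. This is not how Calibre does it internally, just an illustration. The private-use lang codes ("x-angus", "x-mira"), the sample lines, and the helper names are all made up for the example, and it assumes the lang (or xml:lang) attribute sits on the element wrapping each character's dialogue.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Letters only, with optional internal apostrophes (so "didn't" stays one word).
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


class LangWordCounter(HTMLParser):
    """Tally words under the nearest enclosing lang / xml:lang attribute."""

    def __init__(self, default_lang="und"):
        super().__init__()
        self.lang_stack = [default_lang]    # innermost language is last
        self.open_tags = []                 # (tag name, did it push a lang?)
        self.counts = defaultdict(Counter)  # lang -> word -> occurrences

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        lang = attr_map.get("lang") or attr_map.get("xml:lang")
        self.open_tags.append((tag, bool(lang)))
        if lang:
            self.lang_stack.append(lang)

    def handle_endtag(self, tag):
        # Ignore stray closing tags that were never opened.
        if tag not in [name for name, _ in self.open_tags]:
            return
        # Unwind to the matching open tag (also clears unclosed tags like <br>).
        while self.open_tags:
            name, pushed_lang = self.open_tags.pop()
            if pushed_lang:
                self.lang_stack.pop()
            if name == tag:
                break

    def handle_data(self, data):
        lang = self.lang_stack[-1]
        for word in WORD_RE.findall(data):
            self.counts[lang][word.lower()] += 1


def words_by_lang(html):
    """Return {lang: Counter of lowercased words} for an HTML string."""
    parser = LangWordCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical sample: "x-angus" is the character who should always say "dinnae".
SAMPLE = (
    "<p lang='x-angus'>I dinnae ken. I didn't see it.</p>"
    "<p lang='x-mira'>I didn't know either.</p>"
)

if __name__ == "__main__":
    for lang, words in words_by_lang(SAMPLE).items():
        print(lang, dict(words))
```

In the sample, "didn't" showing up in the x-angus list is exactly the kind of stray the word list is meant to surface.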

For games, it also let me easily catch any made-up fantasy words, since they didn't appear in either the US or UK dictionaries.
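The "not in either dictionary" check boils down to a set difference. A minimal sketch, assuming you already have each dictionary loaded as a set of lowercase words (the tiny word sets and the function name below are invented for illustration; real dictionaries would be loaded from files):

```python
def unknown_words(words, *dictionaries):
    """Return the sorted, lowercased words that appear in none of the dictionaries."""
    known = set().union(*dictionaries)
    return sorted({word.lower() for word in words} - known)


# Hypothetical stand-ins for the US and UK dictionaries.
US_WORDS = {"the", "color", "of", "dragon"}
UK_WORDS = {"the", "colour", "of", "dragon"}

if __name__ == "__main__":
    text_words = ["The", "colour", "of", "Zarathuun", "dragon"]
    print(unknown_words(text_words, US_WORDS, UK_WORDS))
```

Anything left over is either a typo or a made-up word, which is a much shorter list to eyeball than the full spellcheck output.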

Last edited by Tex2002ans; 08-02-2020 at 07:57 PM.