08-20-2022, 03:47 PM   #441
Shark69
Zealot
Posts: 136
Karma: 493152
Join Date: Mar 2012
Location: Spain
Device: Kindle Oasis 2
Quote:
Originally Posted by xxyzz
WorldLex's *CDPc values drop sharply after a few rows, and most of them are below one, so they look like percentages. I'm not sure what the *Freq and *FreqPm columns mean.

Google's Ngram has more words and was released more recently, but the frequency data needs to be computed from the "1-grams" files and the "Total counts for 1-grams" file. According to the Ngram Viewer, the frequencies in Google's data are also mostly below one.
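
A rough sketch of that computation, in case it helps. It assumes the 1-grams lines look like "word TAB year,match_count,volume_count TAB ..." and the total counts file is tab-separated "year,match_count,page_count,volume_count" entries; I have not checked this against the actual download, so treat the layout as an assumption. The function names and sample numbers are made up for illustration.

```python
from collections import defaultdict


def parse_total_counts(text):
    """Sum match counts over all years.

    Assumed layout: tab-separated entries of
    'year,match_count,page_count,volume_count'.
    """
    total = 0
    for entry in text.split("\t"):
        entry = entry.strip()
        if not entry:
            continue
        _year, match_count, _pages, _volumes = entry.split(",")
        total += int(match_count)
    return total


def parse_1grams(lines):
    """Sum per-word match counts over all years.

    Assumed layout: 'word TAB year,match_count,volume_count TAB ...'.
    """
    counts = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        word = fields[0]
        for entry in fields[1:]:
            _year, match_count, _volumes = entry.split(",")
            counts[word] += int(match_count)
    return dict(counts)


def frequency_percent(counts, total):
    """Express each word's share of all 1-gram matches as a percentage."""
    return {word: 100.0 * count / total for word, count in counts.items()}


# Made-up sample data, not real Ngram numbers.
sample_1grams = [
    "the\t2018,600,50\t2019,400,40",
    "zealot\t2018,3,2\t2019,2,1",
]
sample_totals = "2018,6000,100,50\t2019,4000,80,40"

total = parse_total_counts(sample_totals)
counts = parse_1grams(sample_1grams)
freqs = frequency_percent(counts, total)
```

Summing over all years is only one choice; restricting to recent years would give frequencies closer to current usage.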

Wiktionary also has many word frequency lists, though they don't include frequency data: https://en.wiktionary.org/wiki/Categ...ts_by_language
https://es.wiktionary.org/wiki/Wikci...de_frecuencias


I'm not sure which data source is better than the others. I'm planning to release a new version, so this feature will probably be added in a future release.
Yes, it is quite difficult to guess the best way to approach it. I can only support your idea and contribute when possible.
Thanks