Old 08-20-2022, 09:57 AM   #440
xxyzz
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
WorldLex's *CDPc values drop sharply after the first few rows, and most of them are below one, so they look like percentages. I'm not sure what the *Freq and *FreqPm columns mean.

Google's Ngram data has more words and was released more recently, but the frequencies have to be computed from the "1-grams" files and the "Total counts for 1-grams" file. According to the Ngram Viewer, the frequencies in Google's data are also mostly below one.
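
For what it's worth, here is a rough sketch of how that computation might look. The file layouts are my assumption, not something I have verified against the downloads: each 1-gram line is taken to be a word followed by tab-separated "year,match_count,volume_count" records, and the total counts file is taken to be tab-separated "year,match_count,page_count,volume_count" records. The function names are made up for illustration, and the result is expressed as a percentage on the guess that this is what the Ngram Viewer shows.

```python
def parse_total_counts(text):
    """Sum match_count over all years in the total counts file.

    Assumed layout: tab-separated records of
    'year,match_count,page_count,volume_count'.
    """
    total = 0
    for record in text.split("\t"):
        fields = record.strip().split(",")
        # Skip empty or malformed records instead of failing on them.
        if len(fields) >= 2 and fields[0].isdigit():
            total += int(fields[1])
    return total


def parse_1gram_line(line):
    """Return (word, summed match_count) for one line of a 1-grams file.

    Assumed layout: 'word<TAB>year,match_count,volume_count<TAB>...'.
    """
    word, *records = line.rstrip("\n").split("\t")
    count = sum(int(r.split(",")[1]) for r in records if r)
    return word, count


def relative_frequency_percent(word_count, total_count):
    """Share of all 1-gram occurrences taken by the word, as a percentage."""
    return 100.0 * word_count / total_count
```

With this, a word's frequency would be relative_frequency_percent(count, total), where count comes from parse_1gram_line and total from parse_total_counts. Both sums here cover every year in the files; restricting them to a year range would be a small change if the real format matches the assumption.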

Wiktionary also has many word frequency lists, but they don't include frequency data: https://en.wiktionary.org/wiki/Categ...ts_by_language
https://es.wiktionary.org/wiki/Wikci...de_frecuencias

I'm not sure which data source is better than the others. I'm planning to release a new version soon, so this feature will probably be added in a later release.

Last edited by xxyzz; 08-20-2022 at 10:16 AM.