Interesting paper. I haven't read all of it yet, but their data (that 8M Excel file) could be used to assign a difficulty value to each word and decide which words to enable. I was looking for data like this before, and it looks promising.
The Kindle *.klld file and English Wiktionary have 79134 and 817135 words/phrases respectively. That Excel file has 61858 lemmas.
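A rough sketch of how the difficulty assignment could work, in Python. Everything here is an assumption for illustration: the `{lemma: score}` mapping stands in for whatever column the Excel file actually provides, a higher score is assumed to mean a better-known word, and the five levels and the function names are made up. It only shows the idea of bucketing lemmas by rank and enabling the harder ones.

```python
def assign_difficulty(scores, levels=5):
    """Map each lemma to a difficulty from 1 (easiest) to `levels` (hardest).

    `scores` is a hypothetical {lemma: score} mapping; a higher score is
    assumed to mean the word is more widely known.
    """
    # Sort from best known to least known; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda lemma: (-scores[lemma], lemma))
    total = len(ranked)
    # Split the ranking into `levels` roughly equal buckets.
    return {lemma: rank * levels // total + 1 for rank, lemma in enumerate(ranked)}


def words_to_enable(difficulty, min_level):
    """Return the lemmas whose difficulty is at least `min_level`."""
    return {lemma for lemma, level in difficulty.items() if level >= min_level}
```

Equal-sized buckets are just the simplest choice. Fixed score thresholds might fit the real data better once the distribution in the file is known.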
Last edited by xxyzz; 07-26-2022 at 09:06 AM.