05-03-2009, 11:38 AM | #1 |
Enthusiast
Posts: 27
Karma: 10
Join Date: Mar 2009
Device: none
|
.txt Dictionaries
Does anyone have any .txt word lists, without the definitions?
Thanks. |
05-03-2009, 10:53 PM | #2 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
It's a .txt version of the HTML word list posted here.
EDIT: This seems to be a very popular attachment, so I think it would be beneficial to finally remove any duplicated words ("dups") from it. I've now trimmed it down to about 95,000 unique words/phrases (15694 words were deleted) in the newer attachment, i.e. -no-dups.zip.
EDIT: Another sorting of the same word list, in increasing order of length, showing "words" with 'n' character(s) and their frequency count, is in -increasing order of characters.zip. Below is a summary count of this ordering:
Code:
Wordlist for Webster's Dictionary 1913 ver. 2.1
With Summary count of "words" with 'n' character(s)
by nrapallo (Nick Rapallo) - November 2009
'n' Characters    Count       %
-------------------------------
_ 1 character        26    0.0%
_ 2 characters       96    0.1%
_ 3 characters      924    1.0%
_ 4 characters     3413    3.6%
_ 5 characters     6066    6.4%
_ 6 characters     9684   10.2%
_ 7 characters    11986   12.6%
_ 8 characters    13870   14.6%
_ 9 characters    13689   14.4%
_10 characters    11788   12.4%
_11 characters     8892    9.3%
_12 characters     6283    6.6%
_13 characters     3968    4.2%
_14 characters     2240    2.4%
_15 characters     1187    1.2%
_16 characters      568    0.6%
_17 characters      280    0.3%
_18 characters      117    0.1%
_19 characters       57    0.1%
_20 characters       23    0.0%
_21 characters       18    0.0%
_22 characters       10    0.0%
_23 characters        5    0.0%
_24 characters        5    0.0%
_25 characters        6    0.0%
_26 characters        2    0.0%
_27 characters        1    0.0%
_30 characters        1    0.0%
_31 characters        1    0.0%
_32 characters        1    0.0%
_33 characters        1    0.0%
_34 characters        1    0.0%
_35 characters        2    0.0%
-------------------------------
"Word" count      95211  100.0%
===============================
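For anyone who wants to produce a summary like the one above from their own word list, here is a minimal Python sketch. It is not the script used for the attachment; the function names `sort_by_length` and `length_summary` are made up for illustration, and it assumes one entry per line, already decoded to text, so that `len()` counts characters rather than bytes.

```python
from collections import Counter


def sort_by_length(words):
    """Order entries by increasing character count.

    sorted() is stable, so entries of equal length keep their input order.
    """
    return sorted(words, key=len)


def length_summary(words):
    """Return (length, count, percent) rows in increasing order of length."""
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return [(n, counts[n], 100.0 * counts[n] / total) for n in sorted(counts)]


# Tiny illustrative sample, not taken from the attachment.
sample = ["the", "a", "word", "an", "at"]

print(sort_by_length(sample))
for n, count, pct in length_summary(sample):
    print(f"_{n:>2} characters {count:>6} {pct:>6.1f}%")
```

To run it on a real file, read the lines first, for example `words = open(path, encoding="utf-8").read().splitlines()`, where `path` is whatever name the unzipped list has on your machine.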
-NR
Last edited by nrapallo; 11-24-2009 at 05:23 PM. Reason: added newer version without any duplicate words ("dups") |
11-06-2009, 01:31 AM | #3 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Newer version added without any duplicate ("dups") words
The above post now includes a version of the original word list without any duplicated words ("dups"), i.e. Websters-Dictionary-1913-wordlist-Text-UTF-8-no-dups.zip!
As always, enjoy! |
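Removing "dups" from a one-entry-per-line word list can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used for the attachment: the helper name `remove_duplicates` is hypothetical, and entries are treated as duplicates only when they are identical after trimming surrounding whitespace (so differently capitalised entries are both kept).

```python
def remove_duplicates(words):
    """Return entries in first-seen order, dropping blanks and exact repeats."""
    seen = set()
    unique = []
    for word in words:
        entry = word.strip()  # ignore stray spaces and line endings
        if entry and entry not in seen:
            seen.add(entry)
            unique.append(entry)
    return unique


# Tiny illustrative sample, not taken from the attachment.
sample = ["Aard-vark", "Abacus", "Aard-vark", "", "Abacus ", "Abaft"]
unique = remove_duplicates(sample)

print(unique)
print(len(sample) - len(unique), "entries removed")
```

Keeping the first-seen order means an already alphabetised list stays alphabetised after the duplicates are dropped.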
11-24-2009, 12:45 AM | #4 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
New version showing "words" in increasing order of characters
See post #2 above for another sorting of the same word list, in increasing order of length, showing "words" with 'n' character(s) and their frequency count, i.e. -increasing order of characters.zip.
A summary count of this ordering is provided for all 95211 "words" (where "words" means words/phrases containing letters, hyphens, apostrophes or special symbols). There are a lot of outdated words in it, so use your discretion when looking up "modern" terms.
Does anyone find these lists useful? If so, how? Just curious...
Last edited by nrapallo; 11-24-2009 at 12:51 AM. |