06-21-2022, 08:42 PM   #324
j.p.s
Grand Sorcerer
j.p.s ought to be getting tired of karma fortunes by now.
Posts: 5,810
Karma: 103362673
Join Date: Apr 2011
Device: pb360
OK, it sounds like spaCy's Entity Ruler is the way to go.

The reason I wanted to have the type column was to correct spaCy's mistakes. It looks like my best option is to use WordDumb to make the XRAY file, then use SQL to fix name, type, and description.
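
A minimal sketch of that SQL step, using Python's built-in sqlite3 module. The table name `entity`, its columns (`label`, `type`, `description`), the type labels, and the sample rows are all assumptions made up for illustration; the real X-Ray file produced by WordDumb should be inspected first and the names adjusted to match.

```python
import sqlite3

# Stand-in database: the table and column names below are hypothetical,
# not a confirmed X-Ray schema. Point sqlite3.connect() at a copy of the
# real file once the actual schema is known.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity ("
    "id INTEGER PRIMARY KEY, label TEXT, type TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO entity (label, type, description) VALUES (?, ?, ?)",
    [
        ("Jon Smith", "ORG", ""),                 # wrong name and wrong type
        ("London", "GPE", "Capital of England."),  # already correct
    ],
)

# Each correction: (label as currently stored, new label, new type, new description)
corrections = [
    ("Jon Smith", "John Smith", "PERSON", "The narrator's brother."),
]


def apply_corrections(conn, corrections):
    """Update name, type, and description for each listed entity.

    Returns the number of rows changed.
    """
    changed = 0
    with conn:  # commits on success, rolls back on error
        for old_label, label, type_, description in corrections:
            cur = conn.execute(
                "UPDATE entity SET label = ?, type = ?, description = ? "
                "WHERE label = ?",
                (label, type_, description, old_label),
            )
            changed += cur.rowcount
    return changed


changed = apply_corrections(conn, corrections)
rows = conn.execute(
    "SELECT label, type, description FROM entity ORDER BY id"
).fetchall()
```

Keeping the corrections in a list like this means they can be reapplied after regenerating the file, rather than being retyped by hand each time.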

I will start using the book from standardebooks.org and remove the soft hyphens.
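
Removing the soft hyphens could look something like the sketch below. It assumes they appear either as the raw U+00AD character or as the `&shy;` entity in the book's text; the function name and sample string are made up for illustration.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, invisible unless a line breaks there


def strip_soft_hyphens(text: str) -> str:
    """Remove soft hyphens given as the raw character or as the &shy; entity."""
    return text.replace(SOFT_HYPHEN, "").replace("&shy;", "")


# Hypothetical sample containing both forms
sample = "ex\u00adtra\u00ador\u00addi\u00adnary cir&shy;cum&shy;stances"
cleaned = strip_soft_hyphens(sample)
```

Ordinary hyphens (U+002D) are left alone, so genuinely hyphenated words are not affected.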

I think XRAY has provisions for aliases (nicknames and other variations). Do spaCy and WordDumb have something similar?