Old 06-26-2022, 08:43 PM   #338
j.p.s
Grand Sorcerer
Quote:
Originally Posted by xxyzz
I thought Amazon's files were created manually...
They are supposedly made at least partly by hand, but see my thread "How many X-ray mistaken identities can you find in your books?"
https://www.mobileread.com/forums/sh...d.php?t=309190
and "Easily fix egregious X-ray errors"
https://www.mobileread.com/forums/sh...d.php?t=303347
especially post #6, which covers one of the books I tested WordDumb on.
Quote:
If you have a CUDA-compatible GPU, you could try spaCy's transformer model (https://spacy.io/usage#gpu); it has higher NER accuracy than the CPU model. You would also have to install the dependencies manually and change the code a bit (a few lines, maybe).

Ultimately, you could train your own model for specific kinds of books.
I use the open source nvidia driver, so I can't use CUDA.
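For anyone following along, the GPU-vs-CPU choice xxyzz describes can be sketched roughly as below. This is a minimal illustration, not WordDumb's actual code: the model names and the fallback chain are my assumptions, and the transformer model additionally requires `pip install spacy[transformers]` plus a model download.

```python
import spacy

# prefer_gpu() returns True only when a usable CUDA GPU is present and
# silently falls back to CPU otherwise (e.g. with the open-source
# nvidia driver, where CUDA is unavailable).
use_gpu = spacy.prefer_gpu()

try:
    # Transformer pipeline: higher NER accuracy, heavier dependencies.
    nlp = spacy.load("en_core_web_trf")
except OSError:
    try:
        nlp = spacy.load("en_core_web_sm")  # small CPU model
    except OSError:
        nlp = spacy.blank("en")  # no NER component; finds no entities

doc = nlp("Sherlock Holmes met Dr. Watson at 221B Baker Street in London.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

With one of the trained models installed, `entities` would list people and places suitable for building X-ray entries; with the blank fallback it is simply empty.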

I'm about out of time for this for a while, but I think I am almost ready to use it on a few books that don't have X-ray, rather than just testing it against books that already do.