#316 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
|
#317 | |
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
Quote:
Would it also be possible for the user to supply the JSON file and for the dialog to have an option to read it? It would also be good for the JSON to have optional fields specifying whether an entity is a person or a term, and maybe a field for the source. |
|
#318 | ||
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Quote:
Quote:
I'm also planning to use Wiktionary's data from https://kaikki.org to add Word Wise <ruby> tags and footnotes to EPUB books. Both features will take some time to implement. |
||
#319 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Hi j.p.s, I have pushed the code to GitHub. Please install the test version from GitHub Actions and see if everything works.
I didn't add the source field to the new dialog because it's more complicated, and I haven't figured out how to make the table and window resize automatically. |
#320 | ||
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
Quote:
I was able to download and install the artifact, bring up the dialog, and enter and save information. But when I run "Create X-Ray", the contents of .config/calibre/plugins/worddumb-custom-x-ray.json do not seem to affect the resulting XRAY.entities.ASIN.asc file.
Quote:
I would also be happy to create the JSON file outside calibre/WordDumb. Of course, having the dialog is nice, since I don't have to guess the format, file name, and location. Having all the entries in one file might not work well across all books, though. Maybe WordDumb could eventually look for a worddumb-custom-x-ray.json file in the same directory as the book? |
||
#321 | |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Quote:
I can add another dialog to select the book, then show the edit X-Ray dialog and save the JSON file in the book folder. You could also create and edit the JSON file without using the edit dialog. The `is person` column will be changed to an NER label column. |
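The per-book lookup discussed here could be sketched roughly as follows. This is only an illustration, not WordDumb's code: the function name `find_custom_x_ray`, the `global_dir` parameter, and the fallback order (book folder first, then a global folder) are all assumptions; only the file name comes from this thread.

```python
from pathlib import Path
from typing import Optional

# File name mentioned in this thread.
CUSTOM_FILE_NAME = "worddumb-custom-x-ray.json"


def find_custom_x_ray(book_path: Path, global_dir: Path) -> Optional[Path]:
    """Return the per-book file if it exists, else the global one, else None.

    Hypothetical helper: the fallback order is an assumption of this sketch.
    """
    for folder in (book_path.parent, global_dir):
        candidate = folder / CUSTOM_FILE_NAME
        if candidate.is_file():
            return candidate
    return None
```

With this ordering, a file placed next to the book would override a shared file, which matches the concern that one file for all books might not work well.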
|
#322 | ||
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
Quote:
I think I understand better now: this new capability is for correcting entities that WordDumb already detects, not for helping WordDumb detect an entity. I know you are having trouble with too many columns in the dialog, but I think it would help to make the first JSON column the entity id (assuming two runs of WordDumb with exactly the same configuration settings detect the same entities and give them the same id numbers). Can WordDumb be run from the command line?
Quote:
Of course, I am most interested in learning what I am doing wrong. Should I use 1984 or some other book for practice so that we can duplicate each other's results? |
||
#323 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
The current feature only uses the customized description if spaCy finds the name in the book, and the two names must match exactly. If the customized X-Ray entry is a person but spaCy thinks it is not, the code sets the entity as a person.
With spaCy's Entity Ruler, I can let spaCy find these customized X-Ray entities even when it couldn't find them before. It's not the number of columns that worries me; I'm having trouble making the table and the dialog window resize automatically, especially when there is a combobox in the table. WordDumb can't run from the command line right now, and I'm not sure whether some features, such as device detection, would work in a terminal. I'm using this book for testing on GitHub, but it has soft hyphens; you may want to remove them or convert the book to KFX to get better X-Ray quality. |
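To make the matching rule described in this post concrete, here is a minimal sketch in plain Python. It does not use spaCy and is not the plugin's code: the `Entity` class, the `apply_custom` function, and the dictionary layout are hypothetical names chosen for illustration. Skipping an empty customized description is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entity:
    name: str
    is_person: bool
    description: str = ""


def apply_custom(detected: Dict[str, Entity], custom: Dict[str, Entity]) -> None:
    """Overlay customized entries onto detected entities, in place."""
    for name, custom_entity in custom.items():
        found = detected.get(name)  # exact name match only
        if found is None:
            continue  # the detector never saw this name, so the entry is ignored
        if custom_entity.description:
            found.description = custom_entity.description
        if custom_entity.is_person:
            found.is_person = True  # a customized "person" overrides the detector
```

The `continue` branch is the limitation raised earlier in the thread: a customized entry has no effect unless the detector already found that exact name.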
#324 |
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
OK, it sounds like spaCy's Entity Ruler is the way to go.
The reason I wanted the type column was to correct spaCy's mistakes. It looks like my best option is to use WordDumb to make the XRAY file, then use SQL to fix the name, type, and description. I will start using the book from standardebooks.org and remove the soft hyphens. I think XRAY has provisions for aliases (nicknames and other variations). Do spaCy and WordDumb have something similar? |
#325 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
You don't have to modify the SQLite file: Entity Ruler will tell spaCy the X-Ray entity's name and type (NER label), and the .asc file will use the customized description if the name is the same.
I'm using RapidFuzz and Wikipedia (normalized titles or redirects) to merge similar X-Ray entities, and I try to use the full name (one containing white space or an interpunct) for a person X-Ray entity. Entity Ruler also supports this, so I can add an aliases column to the table (enter a single alias, or several separated by ",") and assign the same id to all aliases. Last edited by xxyzz; 06-21-2022 at 09:15 PM. |
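The idea of assigning one id to a name and all of its aliases can be sketched as below. This is a standalone illustration rather than Entity Ruler or WordDumb code: `build_alias_index` and its input layout are hypothetical, and using the row position as the id is an assumption.

```python
from typing import Dict, List, Tuple


def build_alias_index(rows: List[Tuple[str, str]]) -> Dict[str, int]:
    """Map each name and each of its aliases to one shared entity id.

    rows holds (name, comma-separated aliases) pairs; the row position
    serves as the id in this sketch.
    """
    index: Dict[str, int] = {}
    for entity_id, (name, aliases) in enumerate(rows):
        variants = [name] + [a.strip() for a in aliases.split(",") if a.strip()]
        for variant in variants:
            index.setdefault(variant, entity_id)  # first definition wins
    return index
```

For example, `build_alias_index([("name a", "name-a,name-A"), ("name b", "")])` maps "name a", "name-a", and "name-A" to the same id, while "name b" gets its own.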
#326 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
I have pushed the changes to GitHub. I added a new customize X-Ray menu that opens the edit dialog for the selected books.
|
#327 | |
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
Quote:
I installed the "Hyphenate This!" plugin, downloaded the above book, clicked the "Remove soft hyphens.." option, and then clicked "OK", but after it finished, the book had not changed. Can you attach the exact book that you use for testing so that we see the same things happen when testing? |
|
#328 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
I didn't remove the soft hyphens from the test files (https://github.com/xxyzz/WordDumb/fi...1564/books.zip) because I didn't know back then that the book had them. The KFX book in the zip file doesn't have soft hyphens, though; maybe Kindle Previewer removed them during conversion.
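For anyone who wants to strip soft hyphens from extracted text by hand, a minimal sketch follows. The soft hyphen is the Unicode character U+00AD; the function name `strip_soft_hyphens` is made up for this example, and it works on a plain string, not on an EPUB file.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def strip_soft_hyphens(text: str) -> str:
    """Remove every soft hyphen; ordinary hyphens are left untouched."""
    return text.replace(SOFT_HYPHEN, "")
```

Applying this to a whole book would mean running it over the text of each HTML file inside the EPUB, which is not shown here.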
|
#329 | ||||
Grand Sorcerer
Posts: 5,802
Karma: 103362673
Join Date: Apr 2011
Device: pb360
|
Quote:
Quote:
Quote:
Quote:
|
||||
#330 |
Evangelist
Posts: 442
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Have you tried the latest plugin from GitHub Actions? You can add a person's name and aliases in the new customize X-Ray menu. You can also create "worddumb-custom-x-ray.json" in a text editor and put it in the book folder. You could add "Livy" and see if it helps, or add all the name variants as aliases of "Livy".
Example file:
Code:
[
    [
        "name a",         # entity name
        "PERSON",         # NER label
        "name-a,name-A",  # aliases
        "name a desc"     # description, leave empty to use Wikipedia summary
    ],
    [
        "name b",
        "PERSON",
        "",
        ""
    ]
]
Last edited by xxyzz; 06-24-2022 at 11:15 PM. |
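A file in this layout can be read with Python's standard json module, as in the sketch below. Note that JSON itself has no comment syntax, so the `#` annotations in the example above presumably have to be left out of a real file. The `CustomEntity` type and the `parse_custom_x_ray` function are hypothetical names for illustration, not part of WordDumb.

```python
import json
from typing import List, NamedTuple


class CustomEntity(NamedTuple):
    name: str
    label: str          # NER label such as "PERSON"
    aliases: List[str]
    description: str    # empty string means "use the Wikipedia summary"


def parse_custom_x_ray(text: str) -> List[CustomEntity]:
    """Parse the four-column list layout shown in the example file."""
    entities = []
    for name, label, aliases, description in json.loads(text):
        # Aliases are stored as one comma-separated string.
        alias_list = [a.strip() for a in aliases.split(",") if a.strip()]
        entities.append(CustomEntity(name, label, alias_list, description))
    return entities
```

Reading a file on disk would then be something like `parse_custom_x_ray(path.read_text(encoding="utf-8"))`, where `path` points at the JSON file in the book folder.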
Tags |
worddumb, x-ray |
|