01-13-2019, 08:22 PM   #1
compurandom
Wikidata GUI

This plugin imports nearly arbitrary metadata from Wikidata for books that already have a Wikidata ID. It also includes a bulk search feature that tries exact title/author searches in Wikidata to find books, and it can try to match books by other identifiers as well. Use the Wikidata metadata plugin to find and add IDs for books that these methods do not find.
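
As a rough illustration of what an exact title/author lookup can look like, here is a minimal sketch. It is not the plugin's actual query: the names `sparql_literal` and `build_exact_match_query`, matching on `rdfs:label`, and the English language tag are all assumptions made for the example. `wdt:P50` is Wikidata's author property.

```python
def sparql_literal(text, lang="en"):
    """Quote text as a language-tagged SPARQL literal (minimal escaping only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def build_exact_match_query(books, lang="en"):
    """Build one query that looks up many (title, author) pairs at once.

    `books` is a list of (title, author) tuples. The query shape is an
    assumption for illustration, not the plugin's real query.
    """
    rows = "\n    ".join(
        f"({sparql_literal(title, lang)} {sparql_literal(author, lang)})"
        for title, author in books
    )
    return f"""SELECT ?item ?title ?author WHERE {{
  VALUES (?title ?author) {{
    {rows}
  }}
  ?item rdfs:label ?title ;
        wdt:P50 ?authorItem .        # P50 = author
  ?authorItem rdfs:label ?author .
}}"""


query = build_exact_match_query([("Dracula", "Bram Stoker")])
```

Because the match is on exact labels, a book whose Wikidata label differs even slightly from the calibre title will not be found, which is why the slower metadata plugin remains the fallback.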

There's a todo.txt in the zip file with the complaints I personally have and possible future features. Discussion of these or other ideas will help prioritize their implementation.

Features supported in version 2.0.0:
  • Bulk search for books in wikidata using exact match on title, author (faster and less likely to hit query limit than the metadata plugin version)
  • Property, Tag, and Identifier editor to help manage Wikidata data before merging it into books
  • Import from books or manually enter arbitrary wikidata properties and entities and mark for filtering
  • Import from books external identifiers in wikidata for editing
  • Merge in series data with support for a second series in a custom column
  • Identifier editor to manage importing and translating of identifiers back and forth to URIs
  • Convert and search by external identifiers already in book metadata
  • Partially merge selected properties when adding new properties.
  • Mark books to indicate merge status
  • Option to hide uninteresting imported properties and tags
  • Import empty placeholder books for books by same author and books in series
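
To make the "identifiers back and forth to URIs" idea above concrete, here is a small sketch, assuming each identifier type has a URL template with `$1` as the placeholder (the convention Wikidata uses for formatter URLs). The table `FORMATTER_URLS` and the functions `id_to_url` and `url_to_id` are hypothetical names for this example, not the plugin's code.

```python
import re

# Hypothetical table: identifier type -> URL template with "$1" as placeholder,
# in the style of Wikidata "formatter URL" values. Entries are illustrative.
FORMATTER_URLS = {
    "gutenberg": "https://www.gutenberg.org/ebooks/$1",
    "wikidata": "https://www.wikidata.org/wiki/$1",
}


def id_to_url(id_type, value):
    """Expand an identifier into a URL, or return None for unknown types."""
    template = FORMATTER_URLS.get(id_type)
    return template.replace("$1", value) if template else None


def url_to_id(url):
    """Return (id_type, value) for the first template matching url, else None."""
    for id_type, template in FORMATTER_URLS.items():
        prefix, _, suffix = template.partition("$1")
        pattern = re.escape(prefix) + r"(.+)" + re.escape(suffix)
        match = re.fullmatch(pattern, url)
        if match:
            return id_type, match.group(1)
    return None
```

For example, `url_to_id(id_to_url("gutenberg", "345"))` gives back `("gutenberg", "345")`, so a URL stored in a book's metadata can be turned into a plain ID and back again.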

Version History:

Version 2.0 - 1 Jan 2021
This is mostly a bug fix version, fixing several long-standing bugs that have irritated me.
It also now works in Python 3 for calibre 5.

* Upgrade minimum version to calibre 5 since there's no testing with calibre 4 (If you are interested in testing, let me know.)
* Changes for Python 3 syntax and Unicode handling
* Upgrade SPARQLWrapper
* Tweak error messages in multiple places so the user actually sees them
* Try harder to differentiate between "no books found" and "Wikidata hit an error and crashed"
* Attempt to recognize wikidata throttling (if there is any)
* Change GET to POST to allow querying more books at once (it now times out if you ask for too many). Books with single quotes in their title can now be matched.
* Update the view so metadata changes show up immediately
* Update user agent string to fit wikidata standards
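
Several of the changes above (POST instead of GET, a descriptive user agent, recognizing throttling) can be sketched with the standard library alone. This is only an illustration: the plugin itself goes through SPARQLWrapper, and the names `build_post_request`, `throttle_delay`, and the `USER_AGENT` value are made up for the example. It also assumes throttling is signaled by HTTP status 429 with an optional `Retry-After` header.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"
# Placeholder value; use your own tool name and contact details.
USER_AGENT = "ExampleWikidataTool/0.1 (https://example.org/contact)"


def build_post_request(query):
    """Prepare (but do not send) a POST request carrying a SPARQL query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "User-Agent": USER_AGENT,
            "Accept": "application/sparql-results+json",
        },
    )


def throttle_delay(status, headers, default=60):
    """Return seconds to wait if the response looks throttled, else None."""
    if status != 429:  # 429 = Too Many Requests
        return None
    try:
        return int(headers.get("Retry-After", default))
    except ValueError:
        # Retry-After may also be an HTTP date; fall back to the default.
        return default
```

Putting the query in the request body rather than the URL is what removes the URL length limit, so the remaining limit on batch size is how long the server is willing to spend on the query.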

Version 1.3 - 24 Feb 2019

Bulk search for books in wikidata by exact match of title, author
- Use this instead of the Wikidata metadata plugin for faster but less complete results.
- Note: don't select too many books at once or it will fail. (The other plugin gets rate limited on bulk import.)
Add features to import other books in series or by same author
* match existing books
* select books and add empty placeholder books for them

User interface for the book import feature is not yet complete -- suggestions welcome!

Version 1.2.1 - 11 Feb 2019
Fix bug with missing mark_* options in defaults
strip spaces for tags and properties

Version 1.2.0 - 28 Jan 2019

Add support for importing series data
second series supported in a custom column

fixed bug involving merging tags

Version 1.0.1 - 22 Jan 2019
Fix a crash during merge caused by occasional unexpected data from Wikidata. This is a quick fix that catches the crash so that the offending book can be marked as in error for manual review and the merge can continue.

A more extensive fix is needed to properly handle conversion to target datatypes and possibly do a better job of merging data based on that.

Version 1.0.0 - 20 Jan 2019
New features:
* Search for books in wikidata by identifier
* Merge button on customize tabs
* Merge button on properties/tags tab can merge just selected properties
* Optionally mark books with merge status
* Option to hide uninteresting imported properties and tags
* convert overdrive gutenberg ids to gutenberg ids
* convert urls to ids from id preferences
* option to disable some delete confirmations
* convert and delete urls matching imported IDs
* option to chain convert IDs
Check forum thread for upcoming features and to give feedback on feature priority.

Bugs fixed:
* sorting now preserves selection
* Fixed a typo in new identifier default tag generation
* Saving ID visibility works now
* Messed with column widths
* Updated missing tooltips
* Performance improvements when updating metadata
* "undefined" dates now update properly

Version 0.3.0 - 13 Jan 2019
Initial release of Wikidata GUI plugin

Majority of planned features are implemented
Most features seem to not be buggy anymore


Known bugs:
  • This version claims to only work in Calibre 5, because it hasn't been tested in Calibre 4.
  • Import books dialog doesn't (yet) save anything from matched books (unmatched books can be marked to import)
  • Edited author/title on imported books is not used (yet)
  • Bulk query lets you select more books than before, but it still has limits: if you pick too many, Wikidata times out trying to find them all, and an insane number of books will probably still make the query choke.
  • Bulk query chokes on titles containing double quotes and control characters
  • Handles multivalue fields poorly when inserting into a non-string field (luckily this isn't very common--needs documentation)
  • Probably needs more documentation, please ask questions!
  • Series and author import is only partially implemented -- metadata update still needs to be implemented
  • Need to add advanced algorithms to do secondary search as well as sort and rank books in a series when wikidata has incomplete information
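
Two of the bulk query problems listed above (too many books in one request, and titles containing double quotes or control characters) could in principle be worked around along these lines. This is a sketch under assumptions, not a fix taken from the plugin: `escape_sparql_string` and `batches` are hypothetical helpers, and dropping unnamed control characters is a design choice made for the example.

```python
def escape_sparql_string(text):
    """Escape text for use inside a double-quoted SPARQL string literal."""
    named = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out = []
    for ch in text:
        if ch in named:
            out.append(named[ch])
        elif ord(ch) < 0x20:
            # Drop remaining control characters rather than send them.
            continue
        else:
            out.append(ch)
    return "".join(out)


def batches(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch would then become its own query, so one oversized selection turns into several smaller requests. A suitable batch size is not stated in this post and would have to be found by experiment.
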
Attached Files
File Type: zip Wikidata-gui.zip (78.4 KB, 58258 views)

Last edited by compurandom; 01-01-2021 at 09:49 PM. Reason: Update to 2.0.0