Quote:
Originally Posted by Boilerplate4U
What I meant is that many of the older variants are far too complex in their structure to find the central thread.
|
Which are you looking at? They all basically come down to the identify method building a search query, running it and examining the results to find the matching books. Then they get the full details for each of the matches.
The complications come in the form of how to do that search and whether you need to try various searches. Most will check for an identifier for the site, then an ISBN, and then do a search with title and author, and possibly a title-only search if that makes sense for the site.
But the interface is described in the calibre source code, in calibre/ebooks/metadata/sources/base.py.
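As a rough illustration of that fallback order, here is a minimal sketch in plain Python. It is not calibre code and does not use the plugin interface: the site URL, the "example" identifier key, and the function names build_queries and identify_sketch are all made up for the example, and the search and detail-fetching steps are passed in as callables so nothing touches the network.

```python
from urllib.parse import urlencode

# Hypothetical site used only for illustration; not a real metadata source.
BASE_URL = "https://books.example.com"


def build_queries(title=None, authors=None, identifiers=None, site_id_key="example"):
    """Return candidate URLs to try, most specific first."""
    identifiers = identifiers or {}
    queries = []

    # 1. The site's own identifier points straight at the book page.
    site_id = identifiers.get(site_id_key)
    if site_id:
        queries.append(f"{BASE_URL}/book/{site_id}")

    # 2. An ISBN search is the next most reliable lookup.
    isbn = identifiers.get("isbn")
    if isbn:
        queries.append(f"{BASE_URL}/search?" + urlencode({"isbn": isbn}))

    # 3. Title plus author search.
    if title and authors:
        params = {"title": title, "author": " ".join(authors)}
        queries.append(f"{BASE_URL}/search?" + urlencode(params))

    # 4. Title-only search as the last resort.
    if title:
        queries.append(f"{BASE_URL}/search?" + urlencode({"title": title}))

    return queries


def identify_sketch(run_search, fetch_details, **book):
    """Try each query in order; fetch full details for the first set of matches."""
    for url in build_queries(**book):
        matches = run_search(url)  # caller-supplied: returns a list of matches
        if matches:
            return [fetch_details(match) for match in matches]
    return []
```

A real plugin would do the fetching and parsing itself and report results the way base.py describes; the only point here is the order in which the searches are tried.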
Quote:
It seems like the api documents would need some simple "howtos" that explain the basic process. Some simple source code templates would be useful as well.
|
Honestly, I don't think a simple source code template would help, because it isn't simple. It would need a simple site to scrape for it to be useful, and those do not exist. And writing anything like this means someone has to have the time. I don't think there have been enough people interested in writing metadata source plugins, or even enough possible sources, to make more extensive documentation worth it. The people who are doing these plugins
Quote:
Scraping content using xpath is quite easy nowadays when you have helper tools like xPather.com and Google Xpath Helper.
|
I'll have to have a look at these the next time a site changes and I have to fix it.
Quote:
Some of the lager content sites I'm going to use also offer api:s like marcxml/marc21 and json-ld which makes things a lot easier.
|
Exactly what sort of metadata are you trying to get???????? I'm sure it is a typo, but...
What sites are you looking at? I am a bit surprised that there are large ebook sites that do not already have metadata source plugins. Unless they are non-English sites. Or maybe very specialised repositories.
You might also find problems with sites that offer APIs if they need any sort of authentication. If you are doing this for personal use, it probably won't be an issue. But some of the sites will either limit access based on a developer key or need individual access, and that makes them harder for the general user to use.
Quote:
When I have some time, I gonna fix a basic HOWTO together with a corresponding code example for future references and put it on github.
Btw, do you know if the metadata plugins are running in parallel or in series (ie when multiple plugins are activated) ??
|
The different sources are run in parallel. There is no interaction between each of them. They return results that calibre then handles.
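To picture what "in parallel, with no interaction" means, here is a minimal sketch in plain Python. It is not calibre's implementation: it assumes one thread per source and a shared queue for results, and run_sources, source_a and source_b are hypothetical names invented for the example.

```python
import queue
import threading


def run_sources(sources, **book):
    """Run every source in its own thread and collect whatever they return."""
    results = queue.Queue()  # thread-safe; the only thing the sources share
    threads = [
        threading.Thread(target=source, args=(results,), kwargs=book)
        for source in sources
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for all sources to finish

    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


# Two stand-in sources. Neither knows the other exists.
def source_a(result_queue, title=None, **_):
    result_queue.put(("source_a", title))


def source_b(result_queue, title=None, **_):
    result_queue.put(("source_b", title))
```

Calling run_sources([source_a, source_b], title="Dune") returns one result per source, in whatever order the threads happened to finish. Merging and ranking those results is left to the caller, which is the part calibre handles.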