07-05-2012, 04:30 PM   #2
kiwidude
calibre/Sigil Developer
That is like asking for McDonald's secret sauce recipe. I'll try to explain it as I remember it... it is very complicated. And of course this may change in future...

When you have multiple metadata download plugins, calibre first tries to group the results together based on their ISBNs being in the same ISBN pool according to Worldcat. That is why you sometimes see one row and sometimes see multiple rows in the metadata download screen - it all depends on whether those differing ISBN editions are recorded as referring to the same book. If different websites return ISBNs for different editions that aren't in the same "group", then you see multiple rows on the metadata download screen.
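To make the grouping idea concrete, here is a rough sketch in Python. It is not calibre's actual code: the function name, the made-up ISBNs and the hard-coded pool lookup (standing in for whatever Worldcat reports) are all assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical lookup: which pool each ISBN belongs to. In the real
# process this information comes from Worldcat; here it is hard-coded.
ISBN_POOLS = {
    "9780000000011": "pool-A",
    "9780000000028": "pool-A",  # a different edition of the same book
    "9780000000035": "pool-B",  # an edition not linked to the others
}

def group_by_pool(results, isbn_pools):
    """Group (source, isbn) pairs into rows, one row per ISBN pool."""
    rows = defaultdict(list)
    for source, isbn in results:
        # An ISBN with no known pool ends up in a row of its own.
        pool = isbn_pools.get(isbn, isbn)
        rows[pool].append(source)
    return dict(rows)

results = [
    ("Goodreads", "9780000000011"),
    ("Amazon", "9780000000028"),
    ("B&N", "9780000000035"),
]
rows = group_by_pool(results, ISBN_POOLS)
for pool, sources in rows.items():
    print(pool, sources)
```

With this sample data Goodreads and Amazon land in one row because their ISBNs share a pool, while B&N gets a second row of its own.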

If you are downloading metadata one book at a time, you get to choose which ISBN pool (group of results) to proceed with, and all the other plugins' results get completely ignored. If you use bulk metadata download, that interactive choice is not available and it just chooses the first pool (the one that would be at the top of the list if you were doing it interactively). That means you can get some pretty crappy results, particularly if you don't have an ISBN and the plugin/website was forced to guess the book by title/author matching. This is why I don't recommend using bulk download unless you are very sure about what you are doing and inspect the results afterwards.

So - having chosen a pool, you are already frequently down to a subset of the results. Some people like myself often end up running the metadata download several times to make sure we scoop up links to all the sites we want (for hyperlinked access in the book details panel, and sometimes to get better data from a particular plugin that was missed the first time). Since calibre replaces the ISBN for the book each time metadata download runs, you will often find that the website(s) that did not return a match in the same pool the first time around all end up in one row (the same ISBN pool) on a second run, provided of course that the website knows the book by that ISBN.

Then comes the bit you are asking about - the munging of results. It depends on the field. For some fields, such as title or publisher, it chooses the shortest value under the assumption that it has the least "cruft" in it. In other fields, like comments, it chooses the longest. For ratings it averages any non-zero rating values. Pubdate is unfortunately tied to dates from Worldcat, which I don't find reliable, but I've mumbled enough about that elsewhere and Kovid has said he will accept a patch if someone will write it.
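As a rough illustration of those per-field rules, here is a minimal Python sketch. Again this is not calibre's real implementation: the function name, the dict layout and the sample values are invented, and pubdate is left out because it comes from Worldcat rather than from the plugin results.

```python
def merge_pool(results):
    """Combine several plugin results (plain dicts) into one record."""
    merged = {}
    # Shortest value wins for title/publisher, longest for comments.
    for field, pick in (("title", min), ("publisher", min), ("comments", max)):
        values = [r[field] for r in results if r.get(field)]
        if values:
            merged[field] = pick(values, key=len)
    # Average the ratings, ignoring missing or zero values.
    ratings = [r["rating"] for r in results if r.get("rating")]
    if ratings:
        merged["rating"] = sum(ratings) / len(ratings)
    return merged

# Invented sample results from three hypothetical sources.
results = [
    {"title": "Example Book (Example Series, #1)",
     "comments": "A long synopsis of the novel.", "rating": 4.0},
    {"title": "Example Book", "publisher": "Acme",
     "comments": "Short blurb.", "rating": 0},
    {"title": "Example Book: A Novel", "publisher": "Acme Books",
     "rating": 5.0},
]
merged = merge_pool(results)
print(merged)
```

Here the merged record takes its title and publisher from the second result, its comments from the first, and a rating of 4.5 averaged from the first and third, so no single website "wins".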

So it isn't the "first" or "last" found, it is a Frankenstein combination.

As for the best to use in combination, you take your pick. Personally, as mentioned above, the stock plugins I use are B&N (best resolution covers), Goodreads (best series information, genre tags and ratings) and FF (another good all-rounder). I also have the Amazon plugin enabled, but along with B&N I disable the comments field for those plugins because I hate paid-for "reviews" as the description. I want an actual synopsis, which I more reliably get from FF or Goodreads.

There are other people who like the weird and wonderful tags that come from Google. If you are a fan of other niches like Baen books, use Webscription for usually top quality covers and descriptions of their books. If you want metadata for international language books, there are a range of plugins for those. Just bear in mind that the more plugins you have, the slower a metadata download will be, and that, as described above, other than choosing the cover or turning off features of certain plugins you have very little control over exactly which website's result will be stored on your book.