03-01-2011, 04:18 AM | #1 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Cover/metadata retrieval when ISBN is un-configured
I just hooked my Overdrive plugin fully into Calibre so I could test its behavior from the GUI. From the CLI everything was working pretty well, but from the GUI I'm seeing things that concern me. It seems to be related to whether or not an ISBN is configured prior to retrieving the cover. Not sure if similar behavior happens for Metadata.
My test book was 'Bitten' by Kelley Armstrong - a popular book, translated into multiple languages, with many editions/title variations. I added the epub to Calibre as a brand new book and restarted Calibre to make sure all caches/previous references were clear. The epub didn't have any ISBN in its metadata, hence that field was empty. I had previously disabled all the cover/metadata download plugins except Google Books/ISBNDB (I would have disabled those too, but at some low level Calibre seems to assume one of those will always be enabled, otherwise metadata download fails instantly). The new Overdrive plugin was also enabled.
I could initiate the cover download either using ctrl-D to download all metadata or just clicking the 'download cover' button in edit metadata. The core function for getting covers works more or less the same as Amazon's get_cover_url. I immediately saw calls to this function for numerous titles/ISBNs for multiple editions of the book. It looked like multiple simultaneous threads were calling this for every book returned by ISBNDB/Google? This all started happening well before the xisbn to overdrive ID mapping would have a chance to occur.
The way the plugin is written, it does a couple of searches against the web server based on each book format, as that was the only way I could find to prioritize ebooks over audio books. It stops on the first successful match. Anyway, the net is that metadata download for a single book caused 74 searches against the web server, and only stopped at 74 because I haven't gotten around to cleansing titles and the final variation wasn't considered a string. Over the course of those 74 queries a number of cover URLs were found, but it kept on going. This was even with a number of successful cache lookup matches to xisbn eliminating some queries. There appeared to be some looping going on here that I didn't fully understand, as I saw the same author/title combo going to the server many times.
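To make the search behavior described above concrete, here is a minimal sketch of a format-prioritized lookup that stops at the first hit and remembers author/title combos it has already tried, so a repeated combo never goes back to the server. Everything in it is an assumption for illustration: `CoverSearcher`, `FORMAT_PRIORITY`, and the `search` callable are hypothetical names, not part of the Overdrive plugin or of Calibre.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical priority order: try ebooks before audio books.
FORMAT_PRIORITY = ("ebook", "audiobook")

# Hypothetical signature: (title, author, format) -> cover URL or None.
SearchFn = Callable[[str, str, str], Optional[str]]


class CoverSearcher:
    """Searches one format at a time and caches results per title/author."""

    def __init__(self, search: SearchFn,
                 formats: Sequence[str] = FORMAT_PRIORITY) -> None:
        self._search = search
        self._formats = tuple(formats)
        # Maps a normalized (title, author) pair to the result already found.
        self._seen: Dict[Tuple[str, str], Optional[str]] = {}
        self.queries_sent = 0

    def find_cover(self, title: str, author: str) -> Optional[str]:
        key = (title.strip().lower(), author.strip().lower())
        if key in self._seen:
            # Same combo as before: answer from the cache, no server query.
            return self._seen[key]
        result = None
        for fmt in self._formats:
            self.queries_sent += 1
            result = self._search(title, author, fmt)
            if result is not None:
                break  # stop on the first successful match
        self._seen[key] = result
        return result
```

With a cache like this, the number of server queries is bounded by the number of distinct title/author combos times the number of formats, however many threads or result rows ask for the same book.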
Anyway, I'm thinking that if the ISBN isn't configured, perhaps only the first closest-match ISBN should be used, and I don't think title variations should be attempted unless the first title didn't return a cover. I'm willing to dig into tuning some of this myself, but I don't know that much about the core of the metadata download code, so some guidance would be helpful. When an ISBN was pre-set before downloading the cover, everything was quite well behaved, functioning exactly as expected. Last edited by ldolse; 03-01-2011 at 04:25 AM. |
03-01-2011, 04:32 AM | #2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Further along this same thread - the ISBN I configured for the book is the ISBN for the epub edition I saw in the Overdrive metadata.
That worked well for covers, but didn't work for other metadata types. Apparently this ISBN doesn't exist in either Google/ISBNDB, and due to that fact Calibre immediately came back with 'No matches found for this book' instead of attempting to query the other metadata providers that are configured. Last edited by ldolse; 03-01-2011 at 05:20 AM. |
03-01-2011, 04:46 AM | #3 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Maybe Kovid has changed things in the last few weeks but one of the limitations I found for my Goodreads covers download is that you can *only* download covers for books with an ISBN. It is hard-baked into a number of places in the Calibre code that if the book has no ISBN then you are out of luck. I reported this as a ticket here and Kovid said it would be addressed as part of the new API.
|
03-01-2011, 05:27 AM | #4 | |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Quote:
The difference I'm seeing for covers vs. metadata providers is that it's not checking the validity of the ISBN before cover download, but for metadata it will only proceed after confirming the ISBN is in Google's or ISBNDB's databases. Now that I think about it, I have a hunch that this is because three interfaces are sharing the same code: bulk download, the download cover button, and the 'Fetch Metadata' button. The fetch metadata button checks each ISBN to see if a cover exists so it can display that info to the user, and I suspect it's that bit that doesn't play well with the Overdrive plugin. Last edited by ldolse; 03-01-2011 at 05:30 AM. |
|
03-01-2011, 09:53 AM | #5 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There are three stages to the metadata download process:
1) An identify stage: this uses isbndb and google books to get the book's isbn from the title/author, or to get the title/author from the isbn. This also queries the cover download plugins' has_cover method.
2) Social metadata download: this downloads tags/rating/comments/series based on the isbn from step 1.
3) Cover download: picks the first cover returned by the cover download plugins based on the metadata discovered so far. All builtin cover download plugins use the isbn. |
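As a rough illustration of the three stages just listed, the flow can be sketched as below. This is a simplified reading of the description, not Calibre's actual code: `Book`, `download_metadata`, and the three callables are hypothetical stand-ins, and the has_cover query made during the identify stage is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Book:
    title: str
    author: str
    isbn: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    cover: Optional[bytes] = None


# Hypothetical stand-ins for the three stages (not Calibre functions).
IdentifyFn = Callable[[Book], Optional[Book]]  # stage 1: fill in isbn/title/author
SocialFn = Callable[[str], List[str]]          # stage 2: tags for an isbn
CoverFn = Callable[[Book], Optional[bytes]]    # stage 3: one cover plugin


def download_metadata(book: Book, identify: IdentifyFn, social: SocialFn,
                      cover_plugins: Sequence[CoverFn]) -> Optional[Book]:
    # Stage 1: identify. If nothing matches, the later stages never run.
    identified = identify(book)
    if identified is None:
        return None
    # Stage 2: social metadata, keyed on the isbn from stage 1.
    if identified.isbn is not None:
        identified.tags = social(identified.isbn)
    # Stage 3: take the first cover any plugin returns.
    for plugin in cover_plugins:
        data = plugin(identified)
        if data is not None:
            identified.cover = data
            break
    return identified
```

In this sketch a failed identify stage ends the whole run, which is one way to read the 'No matches found for this book' result reported earlier in the thread for an ISBN that neither Google nor ISBNDB knows.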
03-01-2011, 10:20 AM | #6 | |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Quote:
I can see an argument for bulk fetches iterating through the ISBNs until the first ISBN that can be associated with a cover, but I don't see why it needs to keep going after that. The other issue is that as soon as an ISBN matches, it begins using that title in the related cover searches, which in general is more likely to fail than the original, since more often than not that variant includes subtitle/series info (though the title cleansing discussed in the other thread could mitigate that somewhat). |
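The early exit argued for here could look something like the following sketch. It is only an illustration of the proposal, not how Calibre behaves: `first_cover`, `cover_by_isbn`, and `cover_by_title` are hypothetical names.

```python
from typing import Callable, Iterable, Optional


def first_cover(isbns: Iterable[str], original_title: str,
                cover_by_isbn: Callable[[str], Optional[str]],
                cover_by_title: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the first cover found, trying ISBNs before the title."""
    for isbn in isbns:
        url = cover_by_isbn(isbn)
        if url is not None:
            return url  # first ISBN with a cover wins; stop querying
    # Fall back to the title the user entered, not a title variant
    # picked up from one of the ISBN matches.
    return cover_by_title(original_title)
```

The title lookup runs only when every ISBN came back empty, and it uses the original title rather than a variant carrying subtitle or series info.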
|
03-01-2011, 10:22 AM | #7 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Because calibre tries to find the best match.
|
03-01-2011, 10:35 AM | #8 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Best match based on what criteria? When I manually search for 'Bitten' in the example above, ISBNDB has 8 matches where the title matches exactly, Google has one, and the one Calibre chooses is the one that includes series information in the title which wasn't originally there... The one Calibre chose isn't wrong, but I wouldn't call it more right than the ones where the title actually matched exactly.
The version that gets chosen is the one at the top of the list when you manually go to the 'Fetch Metadata' button and have it populate a list of options. |
03-01-2011, 10:46 AM | #9 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
See the code in metadata.fetch for exactly how the results are sorted/merged.
|
11-05-2011, 09:43 AM | #10 |
Enthusiast
Posts: 34
Karma: 10
Join Date: Aug 2011
Device: Amazon Kindle 3
|
Hi. Sorry for refreshing this old thread, but it's related to my question.
Many pages have a few variants of a cover (different editions etc.). The possibility of returning more than one cover would be useful. Of course there should be some limit for plugins to avert cover-flooding. I am a big fan of cover flow, so I find this useful. I hope you are too, and that you'll find this worth considering. Thank you. |
11-05-2011, 10:29 AM | #11 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Returning multiple covers is slow. Each cover has to be downloaded before it can be displayed. And you typically cannot download covers in parallel from the same site (as you can from different sites), as that would overload the site's servers. Imagine 5 million calibre users all downloading 10 covers per book from some poor site.
For browsing multiple covers, a Google image search actually works pretty well. See, for example, the Search the Internet calibre plugin. Not as nice as cover flow, obviously, but not too bad either. |
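The constraint about parallel downloads can be illustrated with a small sketch that fetches covers from different sites concurrently while keeping requests to any one site strictly sequential. It is an assumption-laden example, not Calibre code: `group_by_host`, `fetch_covers`, and the injected `fetch` callable are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple
from urllib.parse import urlsplit


def group_by_host(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket URLs by the host they point at."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[urlsplit(url).netloc].append(url)
    return dict(groups)


def fetch_covers(urls: Iterable[str], fetch: Callable[[str], bytes],
                 max_hosts: int = 4) -> Dict[str, bytes]:
    """Download in parallel across hosts, one request at a time per host."""
    groups = group_by_host(urls)

    def fetch_host(host_urls: List[str]) -> List[Tuple[str, bytes]]:
        # One worker handles a whole host, so its URLs are fetched in order.
        return [(url, fetch(url)) for url in host_urls]

    results: Dict[str, bytes] = {}
    with ThreadPoolExecutor(max_workers=max_hosts) as pool:
        for pairs in pool.map(fetch_host, groups.values()):
            results.update(pairs)
    return results
```

Because each host gets a single worker, ten covers from one site still take ten sequential round trips, which is the slowness described above.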
11-05-2011, 01:01 PM | #12 | |
Enthusiast
Posts: 34
Karma: 10
Join Date: Aug 2011
Device: Amazon Kindle 3
|
Quote:
We must try that some day |
|
11-05-2011, 10:07 PM | #13 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Links for the covers are not enough; you also have to download the actual cover data. Still, I have no fundamental objection to allowing plugins to download multiple covers. However, I am not very motivated to implement it either, so patches welcome.
|