Old 03-01-2011, 05:18 AM   #1
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Cover/metadata retrieval when ISBN is un-configured

I just hooked my Overdrive plugin fully into Calibre so I could test its behavior from the GUI. From the CLI everything was working pretty well, but from the GUI I'm seeing things that concern me. It seems to be related to whether or not an ISBN is configured prior to retrieving the cover. Not sure if similar behavior happens for Metadata.

My test book was 'Bitten' by Kelley Armstrong - popular book, translated into multiple languages, many editions/title variations. I added the epub to Calibre as a brand new book and restarted Calibre to make sure all caches/previous references were clear. The epub didn't have any ISBN in its metadata, hence that field was empty.

I had previously disabled all the cover/metadata download plugins except Google Books/ISBNDB (I would have disabled those too, but at some low level Calibre seems to assume one of those will always be enabled, otherwise metadata download fails instantly). The new Overdrive plugin was also enabled.

I could initiate the cover download either using ctrl-D to download all metadata or just clicking the 'download cover' button in edit metadata.

The core function for getting covers works more or less the same as Amazon's get_cover_url. I immediately saw calls to this function for numerous titles/ISBNs for multiple editions of the book. It looked like multiple simultaneous threads were calling this for every book returned by ISBNDB/Google? This all started happening well before the xisbn to overdrive ID mapping would have a chance to occur.

The way the plugin is written, it does a couple of searches against the web server based on each book format, as that was the only way I could find to prioritize ebooks over audio books. It stops on the first successful match.
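To make the per-format search concrete, here is a minimal sketch of "query one format at a time, stop on the first hit". It is not the Overdrive plugin's actual code: the `FORMAT_PRIORITY` tuple, the `search` callable and the `fake_search` stand-in are all assumptions made up for illustration, and no real web server is contacted.

```python
from typing import Callable, Optional, Sequence

# Hypothetical priority order: ebook formats first, audiobooks last.
FORMAT_PRIORITY = ("epub", "pdf", "audiobook")


def first_cover_url(
    title: str,
    author: str,
    search: Callable[[str, str, str], Optional[str]],
    formats: Sequence[str] = FORMAT_PRIORITY,
) -> Optional[str]:
    """Run one search per format and return the first cover URL found."""
    for fmt in formats:
        url = search(title, author, fmt)
        if url:
            # Stop here: lower-priority formats are never queried.
            return url
    return None


def fake_search(title: str, author: str, fmt: str) -> Optional[str]:
    """Offline stand-in for the real web query (assumption for the sketch)."""
    catalogue = {
        ("Bitten", "pdf"): "http://example.invalid/bitten-pdf.jpg",
        ("Bitten", "audiobook"): "http://example.invalid/bitten-audio.jpg",
    }
    return catalogue.get((title, fmt))
```

With this shape the worst case is one query per format for each title/author pair, which is why repeated calls with many title variations multiply quickly.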

Anyway, the net result is that metadata download for a single book caused 74 searches against the web server, and it only stopped at 74 because I haven't gotten around to cleansing titles and the final variation wasn't considered a string. Over the course of those 74 queries a number of cover URLs were found, but it kept on going. This was even with a number of successful cache lookup matches to xisbn eliminating some queries. There appeared to be some looping going on here that I didn't fully understand, as I saw the same author/title combo going to the server many times.

Anyway I'm thinking if the ISBN isn't configured perhaps only the first closest match ISBN should be used, and I don't think title variations should be attempted unless perhaps the first title didn't return a cover.
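A rough sketch of that proposal, under my own assumptions: `candidates` is a best-first list of (isbn, title) pairs from the identify step, and `lookup_by_isbn` / `lookup_by_title` are hypothetical callables standing in for whatever the cover plugin does. None of these names come from Calibre itself.

```python
from typing import Callable, List, Optional, Tuple


def find_cover(
    original_title: str,
    candidates: List[Tuple[str, str]],
    lookup_by_isbn: Callable[[str], Optional[str]],
    lookup_by_title: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try the closest ISBN, then the original title, then title variations."""
    # 1. Only the first (closest-match) ISBN is tried.
    if candidates:
        url = lookup_by_isbn(candidates[0][0])
        if url:
            return url

    # 2. The title the book already had.
    url = lookup_by_title(original_title)
    if url:
        return url

    # 3. Title variations are a fallback only, and each is tried once.
    seen = {original_title}
    for _isbn, title in candidates:
        if title in seen:
            continue
        seen.add(title)
        url = lookup_by_title(title)
        if url:
            return url
    return None
```

The `seen` set is there to avoid the repeated author/title queries described above; the function returns as soon as any lookup succeeds.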

Willing to dig into tuning some of this myself, but I don't know that much about the core of the metadata download code, so some guidance would be helpful.


When an ISBN was pre-set before downloading the cover everything was quite well behaved, functioning exactly as expected.

Last edited by ldolse; 03-01-2011 at 05:25 AM.
Old 03-01-2011, 05:32 AM   #2
ldolse
Wizard
Further along this same thread - the ISBN I configured for the book is the ISBN for the epub edition I saw in the Overdrive metadata.

That worked well for covers, but didn't work for other metadata types.

Apparently this ISBN doesn't exist in either Google/ISBNDB, and due to that fact Calibre immediately came back with 'No matches found for this book' instead of attempting to query the other metadata providers that are configured.

Last edited by ldolse; 03-01-2011 at 06:20 AM.
Old 03-01-2011, 05:46 AM   #3
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
Posts: 4,230
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
Maybe Kovid has changed things in the last few weeks but one of the limitations I found for my Goodreads covers download is that you can *only* download covers for books with an ISBN. It is hard-baked into a number of places in the Calibre code that if the book has no ISBN then you are out of luck. I reported this as a ticket here and Kovid said it would be addressed as part of the new API.
Old 03-01-2011, 06:27 AM   #4
ldolse
Wizard
Quote:
Originally Posted by kiwidude View Post
Maybe Kovid has changed things in the last few weeks but one of the limitations I found for my Goodreads covers download is that you can *only* download covers for books with an ISBN. It is hard-baked into a number of places in the Calibre code that if the book has no ISBN then you are out of luck. I reported this as a ticket here and Kovid said it would be addressed as part of the new API.
Well it definitely didn't seem to want to download covers without an ISBN, though the Overdrive plugin doesn't use ISBN. However it was retrieving ISBN data beforehand from google/isbndb before attempting cover retrieval. The problem was that instead of picking the first/best ISBN from ISBNDB/Google it seemed to be searching for all of the covers at once, and was also over-writing the title in every case.

The difference I'm seeing for covers vs. metadata providers is it's not checking the validity of the ISBN before cover download, but for metadata it will only proceed to get metadata after confirming the ISBN is in Google's or ISBNDB's databases.


Now that I think about it, I have a hunch that this is because three interfaces are sharing the same code - bulk download, the 'download cover' button, and the 'Fetch Metadata' button. The Fetch Metadata button checks each ISBN to see if a cover exists so it can display that info to the user, and I suspect it's that bit that doesn't play well with the Overdrive plugin.

Last edited by ldolse; 03-01-2011 at 06:30 AM.
Old 03-01-2011, 10:53 AM   #5
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 26,312
Karma: 5382313
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
There are three stages to the metadata download process:

1) An identify stage: this uses isbndb and google books to get the book's isbn from title/author, or the title/author from the isbn. This also queries the cover download plugins' has_cover method.

2) Social metadata download: this downloads tags/rating/comments/series based on the isbn from step 1.

3) Cover download: picks the first cover returned by the cover download plugins, based on the metadata discovered so far. All builtin cover download plugins use isbn.
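The three stages can be sketched roughly as follows. This is a simplified illustration and not the code in calibre: the function name, the dict-based book record and the three lists of source callables are assumptions, and the has_cover check from stage 1 is left out entirely.

```python
from typing import Callable, Dict, List, Optional


def download_metadata(
    book: Dict[str, object],
    identify_sources: List[Callable[[Optional[str], Optional[str]], Optional[dict]]],
    social_sources: List[Callable[[str], dict]],
    cover_sources: List[Callable[[dict], Optional[str]]],
) -> Dict[str, object]:
    """Run identify -> social metadata -> cover download on one book record."""
    # Stage 1: identify. Get isbn from title, or title from isbn.
    for identify in identify_sources:
        found = identify(book.get("title"), book.get("isbn"))
        if found:
            for key, value in found.items():
                # Assumption: fields the book already has are kept, not overwritten.
                book.setdefault(key, value)
            break

    # Stage 2: social metadata, keyed on the isbn from stage 1.
    isbn = book.get("isbn")
    if isbn:
        for social in social_sources:
            for key, value in social(isbn).items():
                book.setdefault(key, value)

    # Stage 3: cover download. The first plugin to return a cover wins.
    for cover in cover_sources:
        url = cover(book)
        if url:
            book["cover_url"] = url
            break
    return book
```

For example, passing `{"title": "Bitten"}` with one identify source that returns an isbn leaves the title untouched, fills in the isbn, and then hands the enriched record to the cover sources.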
Old 03-01-2011, 11:20 AM   #6
ldolse
Wizard
Quote:
Originally Posted by kovidgoyal View Post
There are three stages to the metadata download process:

1) An identify stage: this uses isbndb and google books to get the book's isbn from title/author, or the title/author from the isbn. This also queries the cover download plugins' has_cover method.

2) Social metadata download: this downloads tags/rating/comments/series based on the isbn from step 1.

3) Cover download: picks the first cover returned by the cover download plugins, based on the metadata discovered so far. All builtin cover download plugins use isbn.
Understood - I guess my concern is with stage 1. The behavior makes sense when a user clicks the 'fetch metadata' button, as you want to provide them with a list of choices. However there are at least two ways to download metadata which don't allow for user interaction - Bulk Metadata download and the 'download cover' button. Why iterate through every single possible record in those scenarios when the user isn't going to get any choice in the matter anyway? I'm looking at this from both a performance perspective of getting the information downloaded as well as load on the metadata providers that have varying levels of love of Calibre users.

I can see an argument for bulk fetches iterating through the ISBNs to the first ISBN that can be associated with a cover, but I don't see why it needs to keep going after that. The other issue is that as soon as an ISBN matches it begins using that title in the related cover searches, which in general is more likely to fail than the original cover, since more often than not that variant includes subtitle/series info (though the title cleansing discussed in the other thread could mitigate that somewhat).
Old 03-01-2011, 11:22 AM   #7
kovidgoyal
creator of calibre
Because calibre tries to find the best match.
Old 03-01-2011, 11:35 AM   #8
ldolse
Wizard
Best match based on what criteria? When I manually search for 'Bitten' in the example above, ISBNDB has 8 matches where the title matches exactly, Google has one, and the one Calibre chooses is the one that includes series information in the title which wasn't originally there... The one Calibre chose isn't wrong, but I wouldn't call it more right than the ones where the title actually matched exactly.

The version that gets chosen is the one at the top of the list when you manually go to the 'Fetch Metadata' button and have it populate a list of options.
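To illustrate the kind of criterion I mean (this is not how Calibre's metadata.fetch sorts results, which I haven't read yet), here is a sketch that ranks exact title matches ahead of titles that merely start with the query. The function name and the dict layout of a candidate are assumptions for the example.

```python
from typing import Dict, List


def rank_candidates(query_title: str, candidates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Order candidates: exact title match, then prefix match, then the rest."""

    def normalise(text: str) -> str:
        # Case-insensitive, with runs of whitespace collapsed.
        return " ".join(text.lower().split())

    wanted = normalise(query_title)

    def score(candidate: Dict[str, str]) -> int:
        title = normalise(candidate["title"])
        if title == wanted:
            return 0
        if title.startswith(wanted):
            return 1  # e.g. the same title with series info appended
        return 2

    # sorted() is stable, so the provider's own order is kept within each group.
    return sorted(candidates, key=score)
```

Under this ordering a plain "Bitten" record would come before a record whose title has the series name appended, whichever provider returned it first.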
Old 03-01-2011, 11:46 AM   #9
kovidgoyal
creator of calibre
See the code in metadata.fetch for exactly how the results are sorted/merged
Old 11-05-2011, 10:43 AM   #10
fenuks
Enthusiast
fenuks began at the beginning.
 
Posts: 33
Karma: 10
Join Date: Aug 2011
Device: Amazon Kindle 3
Hi. Sorry for refreshing this old thread, but it's related to my question.
Quote:
Originally Posted by kovidgoyal View Post
3) cover download. picks the first cover returned by the cover download plugins based on the metadata discovered so far.
Many pages have several cover variants (different editions etc.). The possibility of returning more than one cover would be useful. Of course there should be some limit on plugins to prevent cover-flooding. I am a big fan of cover flow, so I would find this useful. I hope you are too and that you'll find this worth considering. Thank you.
Old 11-05-2011, 11:29 AM   #11
kovidgoyal
creator of calibre
Returning multiple covers is slow. Each cover has to be downloaded before it can be displayed. And you cannot typically download covers in parallel from the same site (as you can from different sites) as that would overload the site's servers. Imagine 5 million calibre users all downloading 10 covers per book from some poor site.
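As a rough sketch of that constraint (parallel across sites, one request at a time per site), something like the following would do it. This is not calibre's downloader; `fetch_all`, `group_by_host` and the injected `fetch` callable are names assumed for the example, and `fetch` is passed in so the sketch runs without touching the network.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlparse


def group_by_host(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket URLs by the host they point at."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[urlparse(url).netloc].append(url)
    return dict(groups)


def fetch_all(urls: Iterable[str], fetch: Callable[[str], object]) -> Dict[str, object]:
    """Fetch every URL: one worker thread per host, serial within a host."""
    groups = group_by_host(urls)
    results: Dict[str, object] = {}
    if not groups:
        return results

    def fetch_host(host_urls: List[str]):
        # A single thread walks this host's URLs, so the host only ever
        # sees one request at a time from us.
        return [(url, fetch(url)) for url in host_urls]

    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        for pairs in pool.map(fetch_host, groups.values()):
            results.update(pairs)
    return results
```

Ten covers from one site therefore cost ten sequential round trips, which is where the slowness comes from.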

For browsing multiple covers, a google image search actually works pretty well. See for example the search the internet calibre plugin. Not as nice as cover flow, obviously, but not too bad either.
Old 11-05-2011, 02:01 PM   #12
fenuks
Enthusiast
Quote:
Originally Posted by kovidgoyal View Post
you cannot typically download covers in parallel from the same site (as you can from different sites) as that would overload the site's servers.
Typically not, but sometimes it is possible to get links to other covers directly from the same site. I didn't expect that every metadata source plugin would return n covers (n>1), but when conditions are convenient for it, a plugin author should be able to make use of them.
Quote:
Originally Posted by kovidgoyal View Post
Imagine 5 million calibre users all downloading 10 covers per book from some poor site
We must try that some day
Old 11-05-2011, 11:07 PM   #13
kovidgoyal
creator of calibre
Links for the covers are not enough; you also have to download the actual cover data. Still, I have no fundamental objection to allowing plugins to download multiple covers. However, I am not very motivated to implement it either, so patches welcome.
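For anyone considering such a patch, a capped multi-cover download could look something like this sketch. Nothing here exists in calibre today: `MAX_COVERS_PER_PLUGIN`, `collect_covers` and the `download` callable are hypothetical names, and the cap value is arbitrary.

```python
from itertools import islice
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical cap so that a single plugin cannot flood the user with covers.
MAX_COVERS_PER_PLUGIN = 3


def collect_covers(
    cover_urls: Iterable[str],
    download: Callable[[str], Optional[bytes]],
    limit: int = MAX_COVERS_PER_PLUGIN,
) -> List[Tuple[str, bytes]]:
    """Download at most `limit` candidate covers, one after another."""
    covers: List[Tuple[str, bytes]] = []
    # islice stops after `limit` URLs, so no more than `limit` requests are made.
    for url in islice(cover_urls, limit):
        data = download(url)  # the image data itself, not just the link
        if data:
            covers.append((url, data))
    return covers
```

The cap applies to attempted downloads rather than successful ones, so a plugin with several dead links still makes only `limit` requests.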
All times are GMT -4.