Old 03-01-2011, 04:18 AM   #1
ldolse
Cover/metadata retrieval when ISBN is un-configured

I just hooked my Overdrive plugin fully into Calibre so I could test its behavior from the GUI. From the CLI everything was working pretty well, but from the GUI I'm seeing things that concern me. It seems to be related to whether or not an ISBN is configured prior to retrieving the cover. I'm not sure whether similar behavior happens for metadata.

My test book was 'Bitten' by Kelley Armstrong: a popular book, translated into multiple languages, with many editions and title variations. I added the epub to Calibre as a brand new book and restarted Calibre to make sure all caches and previous references were clear. The epub didn't have any ISBN in its metadata, hence that field was empty.

I had previously disabled all the cover/metadata download plugins except Google Books/ISBNDB (I would have disabled those too, but at some low level Calibre seems to assume one of those will always be enabled, otherwise metadata download fails instantly). The new Overdrive plugin was also enabled.

I could initiate the cover download either by using Ctrl-D to download all metadata or by just clicking the 'download cover' button in edit metadata.

The core function for getting covers works more or less the same as Amazon's get_cover_url. I immediately saw calls to this function for numerous titles/ISBNs for multiple editions of the book. It looked like multiple simultaneous threads were calling this for every book returned by ISBNDB/Google? This all started happening well before the xisbn-to-Overdrive ID mapping would have had a chance to occur.

The way the plugin is written, it does a couple of searches against the web server, one per book format, as that was the only way I could find to prioritize ebooks over audio books. It stops on the first successful match.
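
For readers following along, the format-priority search described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the `search` callable, its `(title, author, fmt)` signature, and the format names in `FORMAT_PRIORITY` are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

# Hypothetical format labels, ordered so ebooks are tried before audiobooks.
FORMAT_PRIORITY = ("ebook", "audiobook")

# Hypothetical search signature: (title, author, fmt) -> cover URL or None.
SearchFn = Callable[[str, str, str], Optional[str]]


def find_cover_url(
    title: str,
    author: str,
    search: SearchFn,
    formats: Sequence[str] = FORMAT_PRIORITY,
) -> Optional[str]:
    """Query once per format, in priority order, and stop on the first hit."""
    for fmt in formats:
        url = search(title, author, fmt)
        if url:
            # First successful match wins; lower-priority formats are skipped.
            return url
    return None
```

With this shape, a single title costs at most one query per format, so the query count only explodes if the caller invokes it for many title/ISBN variations.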

Anyway, the net result is that metadata download for a single book caused 74 searches against the web server, and it only stopped at 74 because I haven't gotten around to cleansing titles and the final variation wasn't considered a string. Over the course of those 74 queries a number of cover URLs were found, but it kept on going. This was even with a number of successful xisbn cache lookups eliminating some queries. There appeared to be some looping going on here that I didn't fully understand, as I saw the same author/title combo going to the server many times.
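
One way to keep the same author/title combo from hitting the server repeatedly is to remember queries that have already been made. The sketch below is a generic memoizing wrapper, not anything that exists in Calibre or the plugin; `QueryCache` and the wrapped `search` callable are hypothetical names. A lock is included because the calls appeared to come from several threads at once; holding it during the search serializes requests, which is a simplification.

```python
import threading
from typing import Callable, Dict, Optional, Tuple


class QueryCache:
    """Remember results per (title, author, format) so repeats skip the server."""

    def __init__(self, search: Callable[[str, str, str], Optional[str]]):
        self._search = search  # hypothetical function that queries the server
        self._results: Dict[Tuple[str, str, str], Optional[str]] = {}
        self._lock = threading.Lock()
        self.server_calls = 0  # counts only queries that reached the server

    def lookup(self, title: str, author: str, fmt: str) -> Optional[str]:
        # Normalize case and surrounding whitespace so trivial variations
        # of the same query map to one cache key.
        key = (title.strip().lower(), author.strip().lower(), fmt)
        with self._lock:
            if key not in self._results:
                self.server_calls += 1
                self._results[key] = self._search(title, author, fmt)
            return self._results[key]
```

Misses (a `None` result) are cached as well, so a combination that found nothing is not retried within the same download.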

Anyway, I'm thinking that if the ISBN isn't configured, perhaps only the ISBN of the first (closest) match should be used, and I don't think title variations should be attempted unless the first title didn't return a cover.
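
The proposed policy could look something like the sketch below. It is a sketch under stated assumptions, not how Calibre's metadata download code is structured: `pick_cover`, the `candidates` list of `(isbn, title)` pairs ordered closest match first, and the two lookup callables are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Lookup = Callable[[str], Optional[str]]  # hypothetical: key -> cover URL or None


def pick_cover(
    candidates: List[Tuple[Optional[str], str]],
    cover_by_isbn: Lookup,
    cover_by_title: Lookup,
    known_isbn: Optional[str] = None,
) -> Optional[str]:
    """Apply the proposed policy to candidates ordered closest match first."""
    # A pre-set ISBN is trusted as-is; no guessing from search results.
    if known_isbn:
        return cover_by_isbn(known_isbn)
    if not candidates:
        return None

    # No ISBN configured: use only the closest match's ISBN.
    first_isbn, first_title = candidates[0]
    if first_isbn:
        url = cover_by_isbn(first_isbn)
        if url:
            return url

    # Then the closest match's title.
    url = cover_by_title(first_title)
    if url:
        return url

    # Title variations are tried only after the first title failed,
    # each distinct title at most once, stopping on the first cover.
    seen = {first_title}
    for _isbn, title in candidates[1:]:
        if title in seen:
            continue
        seen.add(title)
        url = cover_by_title(title)
        if url:
            return url
    return None
```

In the common case this costs one or two lookups per book instead of one per edition returned by the other metadata sources.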

I'm willing to dig into tuning some of this myself, but I don't know that much about the core of the metadata download code, so some guidance would be helpful.


When an ISBN was pre-set before downloading the cover everything was quite well behaved, functioning exactly as expected.

Last edited by ldolse; 03-01-2011 at 04:25 AM.