Old 06-23-2011, 08:04 AM   #1
madeinlisboa
Enjoy Life
madeinlisboa began at the beginning.
 
Posts: 24
Karma: 10
Join Date: Jun 2011
Location: Portugal
Device: Kindle
Bulk metadata download incoherent

I almost never get the same metadata from two consecutive metadata downloads. Why is that?
Most of the time I have to edit each book individually in order to add proper metadata, because some tags are returned blank.
My suggestion is to collect the metadata from the most complete source or, as a last resort, to merge data from several sources in order to fill all tags.
Old 06-23-2011, 11:13 AM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 26,314
Karma: 5382313
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Feel free to submit a patch, I eagerly await it.
Old 06-23-2011, 11:40 AM   #3
madeinlisboa
Enjoy Life
madeinlisboa began at the beginning.
 
Posts: 24
Karma: 10
Join Date: Jun 2011
Location: Portugal
Device: Kindle
Quote:
Originally Posted by kovidgoyal
Feel free to submit a patch, I eagerly await it.
Did I offend you? I only posted a suggestion. I thought this was the purpose of this forum. Unfortunately I don't have the time or the skills for that, but I'm sure that selecting the most complete source or merging them is not that hard. I really don't understand the reaction.
Old 06-23-2011, 12:48 PM   #4
theducks
Grand Sorcerer
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 15,054
Karma: 5936659
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by madeinlisboa
Did I offend you? I only posted a suggestion. I thought this was the purpose of this forum. Unfortunately I don't have the time and skills for that, but I'm sure that selecting the most complete source or merging them is not that hard. I really don't understand the reaction
You want the program to 'mind read' what you consider "Good Meta-data" and then are unwilling to provide the code needed.

As with covers, I prefer to hand-pick the metadata by using the Manual Download button on the individual title's Metadata entry page. (I also fine-tune which source supplies which data.)
Old 06-23-2011, 08:21 PM   #5
madeinlisboa
Enjoy Life
madeinlisboa began at the beginning.
 
Posts: 24
Karma: 10
Join Date: Jun 2011
Location: Portugal
Device: Kindle
I didn't say good metadata, I said non-empty metadata. I just suggested that Calibre should ignore blank fields and use values from other sources if available, just that. That is not the current situation. Calibre returns blank metadata for some fields even if it is present in other sources.
Old 06-23-2011, 09:34 PM   #6
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
 
Posts: 4,230
Karma: 1334002
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
@madeinlisboa - as someone who has developed a number of metadata plugins and pored through Kovid's code, I can confirm that your assumptions about the behaviour are not correct.

The current metadata plugins do merge fields together from different sources. However, they will only do this under certain conditions. One of those conditions is that the ISBNs from each of those sources refer to the same edition, which is determined by a call to the xISBN service to identify an ISBN "pool". If results from your sources have ISBNs that fall in the same pool, then data will be selected from across those results, so it could well be series information from one website and comments from another, etc.
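To illustrate the pooling idea described above, here is a minimal Python sketch. It is not calibre's actual code: the `ISBN_POOL` table is a hypothetical stand-in for the xISBN lookup, the field names and ISBNs are made up, and "first non-empty value wins" is an assumed merge rule.

```python
from collections import defaultdict

# Hypothetical stand-in for the xISBN lookup: maps each ISBN to a pool id.
ISBN_POOL = {
    "9780000000001": "pool-a",
    "9780000000002": "pool-a",
    "9780000000003": "pool-b",
}

def merge_by_pool(results):
    """Merge results whose ISBNs share a pool; first non-empty value wins per field."""
    pools = defaultdict(dict)
    for result in results:  # results are assumed to be in source-priority order
        pool = ISBN_POOL.get(result.get("isbn"))
        if pool is None:
            continue  # unknown or missing ISBN: not merged in this sketch
        merged = pools[pool]
        for field, value in result.items():
            if value and not merged.get(field):
                merged[field] = value
    return dict(pools)

results = [
    {"isbn": "9780000000001", "title": "Example", "series": "", "comments": "A blurb."},
    {"isbn": "9780000000002", "title": "Example", "series": "Saga #1", "comments": ""},
    {"isbn": "9780000000003", "title": "Example", "series": "", "comments": "Other edition."},
]
merged = merge_by_pool(results)
```

In this sketch the first two results land in the same pool, so the series comes from one and the comments from the other, while the third result stays separate and keeps its blank series.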

The only other exception to this I am aware of is something I created a request for in Launchpad: the rarer situation where you get multiple results, only a subset of which have an ISBN. This most frequently happens with future book releases. The current behaviour is to take data only from the result with the ISBN, on the assumption that this is a "better" result. I want this changed so that it will also merge with ISBN-less results from another source. So if Goodreads gives you an ISBN, but FantasticFiction has series information but no ISBN, you get both merged together.
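The difference between the two behaviours can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the dictionary layout and the sample values are hypothetical and do not come from calibre's source.

```python
def pick_current(results):
    """Described current behaviour: keep only the results that carry an ISBN."""
    with_isbn = [r for r in results if r.get("isbn")]
    return with_isbn or results  # fall back to everything if no result has an ISBN

def merge_requested(results):
    """Requested behaviour: start from ISBN results, then fill gaps from ISBN-less ones."""
    # Stable sort: results with an ISBN come first, the rest keep their order.
    ordered = sorted(results, key=lambda r: not r.get("isbn"))
    merged = {}
    for result in ordered:
        for field, value in result.items():
            if value and not merged.get(field):
                merged[field] = value
    return merged

# Made-up sample data for the Goodreads / FantasticFiction case in the post.
goodreads = {"isbn": "9780000000001", "series": ""}
fantastic_fiction = {"isbn": "", "series": "Saga #2"}

current = pick_current([fantastic_fiction, goodreads])
requested = merge_requested([fantastic_fiction, goodreads])
```

With the sample data, `current` holds only the Goodreads result with its blank series, whereas `requested` combines the Goodreads ISBN with the FantasticFiction series.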

As to why you get different results when you repeatedly search, I presume you are talking about the same book? In that case the answer comes down to the ISBN. Every time you do a metadata download for a book, the ISBN for the book will get overwritten with one from the results. This means your ISBN may flop around a bit across multiple metadata retrieves. You may then find that one of the other metadata sources gives you a different result from the previous one, because it happens to have different metadata for that edition of the book.

If you want maximum population of data there are a number of steps you can take. Firstly, use plugins that populate the fields you are interested in for the books you have. For instance, if you want series information, the Goodreads or Fantastic Fiction plugins are the best. For tags as genres, use Goodreads. For covers, I prefer B&N with FF or Goodreads as a fallback option. Others have said they like Amazon, which is fine for some things (though it gives no series or tags information, and its covers can be hit and miss and low resolution).

Secondly, a good ISBN will always give you the best match. The Extract ISBN plugin can help with this, provided the ISBN is in the book for it to find. Note that this plugin is not infallible: not all books have an ISBN it can read, or worse, sometimes the ISBN it finds is from the publisher advertising some other book within it. However, a very high percentage of the time it gets it right, and it will give you the best chance of a quality edition match with most metadata sources.

Thirdly, download metadata one at a time if you want serious quality. Take care as to what fields you have selected to overwrite - if you have title and authors checked, then make sure in the results it really is the book you expect. The metadata sources are not miracle workers as they are at the mercy of the results returned by the website search engines. Sometimes they prioritise books they want to sell in the search results (like box sets) over the actual edition you want. Or you might have the wrong ISBN and get data for a different book. By reviewing the search results of metadata download you can make sure you get a result for the right book.

Fourthly, it might be necessary to do multiple retrieves for a single book. Remember the ISBN may change with each retrieve. So if it fails to find a match on a site for your first ISBN (or perhaps you didn't even have an ISBN), it might find one with the second download. I make sure that where possible every single book in my library is linked to its FantasticFiction page, B&N and Goodreads. If I have the right ISBN I can get this in one download, sometimes it takes more.

And finally, in the case of my plugins at least, sometimes it is necessary to manually assign the identifier. All my metadata plugins work on the basis that if you have a website-specific identifier (ff: or barnesnoble: or goodreads: ) then the plugin will jump straight to the webpage for that book to pull metadata from it. This is both fast and means you are not at the mercy of the website search engine. These identifiers will get populated automatically when the plugin finds a match for a book. However, if needed you can force such a match manually by typing in the id it needs (FF/B&N) or use the Goodreads Sync plugin "Link book" feature to assign a Goodreads id. If you do this and then do a metadata retrieval, as I said above, it will pull data from that specific edition page of the book.
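The "identifier first, search as fallback" logic can be pictured with a short Python sketch. Only the prefixes `ff`, `barnesnoble` and `goodreads` come from the post; the function names, the comma-separated input format and the sample ids are assumptions for illustration, not the plugins' real code.

```python
# Identifier prefixes named in the post; everything else here is hypothetical.
SITE_PREFIXES = ("ff", "barnesnoble", "goodreads")

def parse_identifiers(text):
    """Turn 'goodreads:123, isbn:978...' into {'goodreads': '123', 'isbn': '978...'}."""
    identifiers = {}
    for item in text.split(","):
        prefix, sep, value = item.strip().partition(":")
        if sep and value.strip():
            identifiers[prefix.strip().lower()] = value.strip()
    return identifiers

def lookup_plan(identifiers, site_prefix):
    """Return ('direct', id) when a site-specific id exists, else ('search', None)."""
    site_id = identifiers.get(site_prefix)
    if site_id:
        return ("direct", site_id)  # jump straight to that book's page
    return ("search", None)         # fall back to the website's search engine

ids = parse_identifiers("isbn:9780000000001, goodreads:12345")
plans = {prefix: lookup_plan(ids, prefix) for prefix in SITE_PREFIXES}
```

Here only the Goodreads source would go directly to the book's page; the other two would still depend on the site search until their identifiers are filled in.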

Those are my tips, if you want seriously good data. It may sound like more work than hitting Ctrl+D on a bunch of books and expecting "magic" to happen. In reality you can do several books a minute and you only have to do it once...

Last edited by kiwidude; 06-23-2011 at 09:41 PM. Reason: typos
Old 06-24-2011, 02:18 PM   #7
speakingtohe
Wizard
speakingtohe ought to be getting tired of karma fortunes by now.
 
Posts: 4,597
Karma: 25170940
Join Date: Apr 2010
Device: sony PRS-T1 and T3, Kobo Mini and Aura HD, Tablet
@kiwidude
Thanks. That answered a few questions that I was also wondering about.
The new metadata plugins are pretty damn decent IMO. (The old ones were fine, but the new ones are better.)

Helen