12-20-2016, 02:23 PM | #1 |
Groupie
Posts: 156
Karma: 511136
Join Date: May 2013
Location: -- Home is where the RV stops (Texas ~6 months/year)
Device: Kindle Fire HDX, Fire HD, Paperwhite, Android & Windows phones
|
Getting Metadata from Amazon
I get most of my books from Amazon, and they seem to have the best selection of metadata (for ebooks). By best I mean not only the data itself, but also being able to use the initial data in a global search & replace (for example, converting all the various forms of Science Fiction to "SFF__"). Some parts of their data are just silly, and I periodically do a search-and-delete on them (such as "Two Hours or More (65-100 Pages)," and its cousins). I do a lot of editing on metadata, and have a lot of custom tags, but in many cases I'm dependent, at least initially, on the original Amazon data as a starting point.
However, the ID metadata that is returned (e.g., mobi-asin:B01M33A032) has a very poor record of finding the correct book on Amazon when I hit [Download Metadata]: considerably less than 50%. It will bring back the wrong book, no books, or the book in its dead-tree version, which normally has little metadata. However, if I manually put "Amazon:B01M33A032" in the IDs field, my success rate goes to 80% - 90%. So, is there some script or add-in available to do this automatically on a selected subset of books? With 500+ new books from the recent Open Roads giveaway, I really don't want to do this manually. My goal is to select the 500+ books, run a script to add the "Amazon:B0xxxxxxxx" automagically, then use [CTRL-D] to download metadata only and get the correct Amazon ebook metadata most of the time. Thanks. |
12-20-2016, 09:14 PM | #2 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Metadata download does not use mobi-asin, as there is no way to know which country store a mobi-asin corresponds to. You can use the search and replace feature of the bulk metadata edit dialog to mass convert the mobi-asin: identifiers to amazon: identifiers before running the download.
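For anyone who would rather script the conversion than use the dialog, here is a minimal sketch of the identifier rewrite itself. It only works on a plain dict of identifiers and does not touch a calibre library; the function name and its `keep_original` and `domain` parameters are made up for illustration. The `amazon_<domain>` key form is assumed from the way the Amazon plugin parses identifier keys, and a plain `amazon` key is treated as the .com store.

```python
def add_amazon_identifier(identifiers, keep_original=True, domain="com"):
    """Return a copy of `identifiers` with an Amazon id derived from 'mobi-asin'.

    Hypothetical helper: `identifiers` is a plain dict such as
    {'mobi-asin': 'B01M33A032'}; nothing here talks to calibre.
    """
    result = dict(identifiers)
    asin = result.get("mobi-asin")
    # 'amazon' means the .com store; other stores use 'amazon_<domain>'.
    target_key = "amazon" if domain == "com" else "amazon_" + domain
    # Leave the book alone if there is no mobi-asin or an Amazon id already exists.
    if not asin or target_key in result:
        return result
    result[target_key] = asin
    if not keep_original:
        del result["mobi-asin"]
    return result


print(add_amazon_identifier({"mobi-asin": "B01M33A032"}))
```

With `keep_original=True` the book ends up with both identifiers; passing `keep_original=False` gives a true replace instead. Since a mobi-asin does not say which country store it came from, the `domain` argument has to be supplied by whoever knows where the books were bought.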
|
12-23-2016, 12:33 AM | #3 | |
Evangelist
Posts: 417
Karma: 6913952
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
|
Quote:
|
|
12-26-2016, 07:00 PM | #4 |
Connoisseur
Posts: 66
Karma: 14170
Join Date: Oct 2011
Device: kindle 1
|
The last few days I have been having really spotty results downloading metadata from Amazon. One out of 10 books will actually download. These are books that I just downloaded from my Amazon account via computer and then added to calibre. They do not bring any metadata such as comments or tags with them; however, the books are good and I can read them with the book viewer. I then hit Control-D after selecting all the books that I downloaded. Typically it just errors out on all of them. Sometimes I can pick a specific book, hit E for edit, and then download metadata from there. More often it isn't working there either.
Maybe later I will try again and it works. I read this thread and enabled the OverDrive plugin, and one of the three books I had been trying to download metadata for just moments before actually worked, but the other two did not. For something that was working well all the time just days ago, it is really frustrating. I'm thinking some of it is on Amazon's end, as I am sometimes getting the error below. I'm not sure it is a valid message about too many books, as this time I was only trying to get metadata for three books. I can change my IP address and it will still give that error. It's just frustrating.
get_details failed for url: 'https://www.amazon.com/Darker-Element-Beyond-Godhunter-Book-ebook/dp/B00VH4AQIY/ref=sr_1_2/163-9040836-8902326?s=books&ie=UTF8&qid=1482796511&sr=1-2'
Traceback (most recent call last):
  File "site-packages/calibre/ebooks/metadata/sources/amazon.py", line 297, in run
  File "site-packages/calibre/ebooks/metadata/sources/amazon.py", line 310, in get_details
  File "site-packages/calibre/ebooks/metadata/sources/amazon.py", line 315, in parse_details
CaptchaError: Amazon returned a CAPTCHA page, probably because you downloaded too many books. Wait for some time and try again.
************************************************** ****************************** |
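The error text suggests waiting and trying again. A rough sketch of what "wait and retry" could look like in a script is below. Everything in it is hypothetical: `CaptchaError` is a stand-in class defined here (not calibre's own), `fetch` is whatever callable does the download, and the delays are arbitrary. Given the reports in this thread, there is no guarantee that waiting actually clears the CAPTCHA.

```python
import time


class CaptchaError(Exception):
    """Stand-in for the error in the traceback; not calibre's own class."""


def fetch_with_backoff(fetch, max_attempts=4, base_delay=60.0, sleep=time.sleep):
    """Call fetch(); on CaptchaError wait base_delay, 2x, 4x, ... seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except CaptchaError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            # Double the pause after every failed attempt.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is only there so the waiting can be replaced in a test; in normal use the default `time.sleep` applies.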
12-27-2016, 05:53 AM | #5 | |
Groupie
Posts: 167
Karma: 158116
Join Date: Oct 2015
Device: Kobo Glo HD (landscape), Kobo Aura One
|
Quote:
Thanks for the hint. My only question is whether it is really intended that the identifier ends up doubled, i.e. both mobi-asin and amazon, instead of being replaced. |
|
12-28-2016, 11:41 AM | #6 | |
Wizard
Posts: 1,760
Karma: 9918418
Join Date: Feb 2013
Location: Here on the perimeter, there are no stars
Device: Kobo H2O, iPad mini 3, Kindle Touch
|
Quote:
To be clear, these are all books I bought on Amazon US as Kindle-format ebooks, and I've used the "convert mobi-asin to amazon identifier" trick. Amazon should be giving metadata on all of them, but something about the bulk download attempt isn't working. |
|
12-28-2016, 12:25 PM | #7 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jul 2016
Device: Kindle Apps (Desktop, Android phone & tablet)
|
I just started seeing this again as well. I saw it first back in July and there was some discussion then about the use of hard-coded vs random user-agents (https://www.mobileread.com/forums/sho...d.php?t=276443). But I installed the 2.75 Calibre update a few days ago and noticed this behavior again yesterday. The trick of changing the ID tag from mobi-asin:XXXX to amazon:XXXX seems to fix it, but I thought I'd bring up the user-agent issue and ask if that might also be involved.
|
12-28-2016, 12:59 PM | #8 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There have been no changes to the user agent in any recent release.
|
12-30-2016, 10:22 AM | #9 | |
Junior Member
Posts: 1
Karma: 10
Join Date: Dec 2016
Device: tablet, kindle pw, kobo aura one
|
Quote:
Of course this should be handled as an option to the amazon module, defaulting to false. After playing around with this I came up with the following changes to amazon.py. I know there are different ways to work around this, but this is really easy to use (at least if most of your ebooks come from a single Amazon portal that is available for selection in your amazon.py module). Would you be willing to consider adding something like that to your upstream module? Code:
--- a/src/calibre/ebooks/metadata/sources/amazon.py
+++ b/src/calibre/ebooks/metadata/sources/amazon.py
@@ -793,6 +793,9 @@ class Amazon(Source):
         Option('domain', 'choices', 'com', _('Amazon website to use:'),
                _('Metadata from Amazon will be fetched using this '
                  'country\'s Amazon website.'), choices=AMAZON_DOMAINS),
+        Option('use_mobi_asin', 'bool', False,
+               _('use ebook-internal mobi-asin to match eBook'),
+               _('Match eBook on selected Amazon site using the mobi-asin identifier contained in most Amazon eBooks')),
         )
 
     def __init__(self, *args, **kwargs):
@@ -837,6 +840,8 @@ class Amazon(Source):
             key = key.lower()
             if key in ('amazon', 'asin'):
                 return 'com', val
+            if self.prefs['use_mobi_asin'] and key == 'mobi-asin':
+                return self.prefs['domain'], val
             if key.startswith('amazon_'):
                 domain = key.partition('_')[-1]
                 if domain and (domain in self.AMAZON_DOMAINS or domain in extra_domains):
|
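To make the effect of the proposed option easier to see, here is a standalone sketch of the same lookup as a plain function. It is modelled on the hunk in the patch, not copied from calibre: `AMAZON_DOMAINS` is an illustrative subset, and `default_domain` and `use_mobi_asin` are ordinary arguments standing in for the plugin preferences. The sketch compares the key with `==`; writing `key in ('mobi-asin')` would be a substring test, because parentheses without a trailing comma do not create a tuple.

```python
# Illustrative subset only; the real plugin has its own, longer list.
AMAZON_DOMAINS = {"com", "uk", "de"}


def get_domain_and_asin(identifiers, default_domain="com", use_mobi_asin=False):
    """Return (domain, asin) for the first usable identifier, else (None, None)."""
    for key, val in identifiers.items():
        key = key.lower()
        if key in ("amazon", "asin"):
            return "com", val
        # Only trust the embedded mobi-asin when the user has opted in,
        # and then assume it belongs to the user's chosen store.
        if use_mobi_asin and key == "mobi-asin":
            return default_domain, val
        if key.startswith("amazon_"):
            domain = key.partition("_")[-1]
            if domain in AMAZON_DOMAINS:
                return domain, val
    return None, None


print(get_domain_and_asin({"mobi-asin": "B01M33A032"}, "de", use_mobi_asin=True))
```

With the option off, a book that only carries a mobi-asin yields `(None, None)`, which matches the behaviour described earlier in the thread where the download falls back to a title/author search.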
|
12-30-2016, 12:22 PM | #10 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Sure, I have no objection to adding such an option.
|
02-15-2017, 10:44 AM | #11 |
Connoisseur
Posts: 66
Karma: 14170
Join Date: Oct 2011
Device: kindle 1
|
Update on this for me. A few days after my last post on this, it cleared up and worked fine without me doing anything. About 4 days ago it started doing it again. The error message for Amazon indicates a CAPTCHA page blocking the query.
****************************** Amazon.com ******************************
Request extra headers: [('User-agent', u'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko')]
Found 0 results
Downloading from Amazon.com took 0.408433914185
Plugin Amazon.com failed
Traceback (most recent call last):
  File "site-packages/calibre/ebooks/metadata/sources/identify.py", line 48, in run
  File "site-packages/calibre/ebooks/metadata/sources/amazon.py", line 1163, in identify
  File "site-packages/calibre/ebooks/metadata/sources/amazon.py", line 1073, in parse_results_page
CaptchaError: Amazon returned a CAPTCHA page, probably because you downloaded too many books. Wait for some time and try again.
************************************************** ******************************
I find the message confusing, since it blames the number of books. I have been able to do 100 to 200 books at a time with no issues other than individual books it was unable to identify. Other times a single book will give this message. I have thought maybe it had something to do with sometimes running through my VPN service, which could have thousands of other users on that IP, with too many of us collectively making similar requests. However, I can disconnect from the VPN and get identical results/errors. The error makes me think it is an Amazon security feature causing the problem. However, it seemed to start and end not long after I updated calibre in both instances. Take this feeling with a huge grain of salt though, as it's only worth the digital ink I'm printing it with. |
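When a bulk download produces a long log, it can help to pull out just which sources failed and why. The sketch below assumes the log looks like the excerpt in this post, with a "Plugin <name> failed" line followed by a traceback that ends in an exception name and a colon. The function name is made up, and real logs may contain other layouts that it does not handle.

```python
import re


def failed_plugins(log_text):
    """Return {plugin_name: exception_name or None} for each 'Plugin X failed' entry."""
    failures = {}
    # Splitting on a pattern with one capture group gives
    # [text_before, name1, text_after1, name2, text_after2, ...].
    parts = re.split(r"Plugin ([^\n]+?) failed", log_text)
    for name, body in zip(parts[1::2], parts[2::2]):
        # The first "SomethingError:" or "SomethingException:" after the
        # marker is taken as the reason for that plugin's failure.
        match = re.search(r"(\w+(?:Error|Exception)):", body)
        failures[name] = match.group(1) if match else None
    return failures
```

Run over the excerpt in this post, this reports Amazon.com as failing with CaptchaError, which at least separates CAPTCHA blocks from books that simply could not be identified.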
02-15-2017, 11:15 AM | #12 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It is an Amazon anti-bot measure and it uses statistical techniques -- so it is impossible to predict/understand its behavior.
|
07-02-2017, 12:43 PM | #13 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2017
Device: android
|
I have searched these forums and still haven't seen a solution for the metadata tag download issues from Amazon. I can see the tags on the web page for the book, and if I experiment using the Amazon Product API, I can retrieve them (though this is less than ideal and took a fair amount of registration and setup to get going).
There are no errors in the metadata download log, but just no tags either. This gets fairly annoying as it makes adding medium-to-large sets of books very difficult and time-consuming. Is anyone else having this problem or is it just me? |
07-02-2017, 01:23 PM | #14 |
creator of calibre
Posts: 43,844
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The plugin does not support reading tags. IIRC, tags are loaded on the website using JavaScript.
|
07-02-2017, 05:44 PM | #15 |
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2017
Device: android
|
Thanks.... would like to make my own plugin I guess
Ah, that makes perfect sense.
I've been looking through the docs and code available for plugins, but haven't yet figured out how to do a simple one that would take the book selection and execute my Python code, which uses my Amazon API login information. It's tough looking through all the plugins to find one that's a good starting point. |