Old 12-17-2015, 02:06 AM   #243
trying
Quote:
Originally Posted by Krazykiwi View Post
Could you make it add any random character at the end? a %s.%random-alpha-char% or %s-%randomalphachar ('scuse my lack of python-fu, but that ought to be a fairly simple little function even if it's not a built-in right?) That would make every request be treated as "the first time".
Sure, just do something like:
Code:
import random

# chr(97) == 'a' and chr(122) == 'z', so this picks one random lowercase letter
randomletter = chr(random.randint(97, 122))
And you could even append a random number of lowercase letters, say 5-10.

But it's not necessary at this point. Just adding a dash works fine, and I also doubt goodreads will change this (for the same reasons you gave in your next post).
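For what it's worth, the 5-to-10-letter variant could be sketched like this (the book ID and URL shape below are illustrative examples, not necessarily what the plugin actually builds):

```python
import random
import string

def random_suffix(low=5, high=10):
    """Return between `low` and `high` random lowercase ASCII letters."""
    n = random.randint(low, high)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(n))

# Append the suffix after the numeric ID so each request looks "new".
# The base URL is only an example of the pattern being discussed.
book_id = "18007564"
url = "https://www.goodreads.com/book/show/%s-%s" % (book_id, random_suffix())
```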

Without my one-char fix, however, people will see 403 errors when they try to download Goodreads metadata for books that already have a "goodreads:" identifier. This behavior is easily reproducible by anyone.

Quote:
Originally Posted by Krazykiwi View Post
I'm not so sure it's hurting anything.

The URLs working with text after the ID is definitely by design, not a bug. The URLs used internally on the site, such as the ones you get when you use the search interface or add a link into one of their forum posts using that search (an entirely different one from the main site search), are constructed by concatenating/truncating the title, similar to how Calibre shortens titles on save to disk. You can see them change when you edit a book title (or, since authors work the same way, when you edit an author name on the site). It's been that way for years too, and is unlikely to change.

Secondly, if the GR plugin was truly being a good citizen, it'd be using the API, not scraping.

The API certainly has rate limits set (and doesn't the UI stop you pulling up more than 50 books at a time in some circumstances? It's been a very long time since I had 50 books to add metadata to, so I may be misremembering).

I think, since Calibre is already being a bit of a dandy highwayman as far as data fetching goes, this isn't so bad. As long as it's not hitting more than once a second, it's not being any worse of a citizen than it's ever been.
I completely agree with everything said.

I presume (though I didn't look into it) that Calibre purposely sends no more than 1 request per second per Calibre instance, to follow the spirit of the Goodreads terms of use if not the letter.
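If someone wanted to enforce that pacing themselves, a minimal client-side throttle might look like the sketch below. The one-second interval matches the limit discussed above; the class name and structure are my own, not Calibre's actual implementation:

```python
import time

class Throttle:
    """Ensure at least `interval` seconds elapse between calls to wait()."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self._last = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        # Sleep for whatever portion of the interval hasn't elapsed yet.
        now = time.monotonic()
        remaining = self.interval - (now - self._last)
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()

throttle = Throttle(1.0)
# throttle.wait()  # call this before each metadata request
```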

If the plugin did use the API, the limit would be 1 request per second per developer key. Every user of the plugin would then need to get their own key; otherwise, all simultaneous users of the plugin would share a single request per second.