09-07-2022, 03:50 PM   #620
kiwidude
Calibre Plugins Developer
Posts: 4,732
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by Kabutak
Still getting a number of undefined/empty publisher and date entries with the new version. Not unexpected but worth noting. More concerning was getting a number of 403 errors when attempting to fetch the publisher information for a bunch of books. It's possible that I was just being rate limited or something since I was updating a bunch of entries (~50) at once, but figured I'd mention it.
That most definitely looks like rate limiting to me. I hit the same thing with another private plugin I wrote recently against Goodreads, which was trying to process books in bulk. I ended up putting wait delays into that plugin when dealing with multiple books, which makes it run considerably slower, but at least you can let it run in the background. If you do books one at a time then you are usually OK, given the inevitable delay in between.
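
For anyone curious what "wait delays" can look like in practice, here is a minimal sketch. It is not the code from either plugin. The `fetch` callable, the `RateLimited` exception and the parameter names are all hypothetical, and the doubling of the wait after a 403 is an extra assumption on top of the plain delay described above.

```python
import time
from typing import Callable, Dict, Iterable, Optional


class RateLimited(Exception):
    """Hypothetical: raised by the fetch callable when the server answers 403."""


def fetch_in_bulk(
    book_ids: Iterable[str],
    fetch: Callable[[str], str],
    delay: float = 2.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Optional[str]]:
    """Fetch each book in turn, pausing between books and backing off on 403s."""
    results: Dict[str, Optional[str]] = {}
    for index, book_id in enumerate(book_ids):
        if index > 0:
            sleep(delay)  # fixed pause between consecutive books
        wait = delay
        for attempt in range(max_retries + 1):
            try:
                results[book_id] = fetch(book_id)
                break
            except RateLimited:
                if attempt == max_retries:
                    results[book_id] = None  # give up on this book, carry on
                else:
                    wait *= 2  # assumption: double the wait after each 403
                    sleep(wait)
    return results
```

Passing `sleep` in as a parameter is only there so the pauses can be swapped out when trying the function without a real network call.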

TBH downloading metadata in bulk is not something I have ever recommended to anyone, nor do I ever do it myself, because there are too many cases where the Goodreads search engine will suggest the wrong book as the first match, and you won't catch it if you are not looking at them one by one. Yup, it is slower/more clicks etc., but you only need to do it once for each book and then it is there forever "correct", rather than a mess you have to clean up later by fixing tags, series, covers, description etc...

If you have any specific examples (Goodreads ids ideally) of books that are missing publisher/date data where you can see the info on the page, please PM them through to me and I can figure it out. The "joy" of HTML web scraping is that there are no rules we can rely on the website authors following; there are always special cases we have to deal with...
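
To illustrate the "special cases" point, here is a rough sketch of how a scraper might try several patterns in turn and simply return nothing when none match. Both markup variants below are made up for the example and are not taken from the real Goodreads pages, and the function names are hypothetical rather than anything in the plugin.

```python
import re
from typing import Callable, List, Optional


def _from_labelled_row(html: str) -> Optional[str]:
    # Hypothetical variant 1: "Published by <span>Name</span>"
    match = re.search(r'Published by\s+<span[^>]*>([^<]+)</span>', html)
    return match.group(1).strip() if match else None


def _from_data_attribute(html: str) -> Optional[str]:
    # Hypothetical variant 2: a data-publisher="Name" attribute
    match = re.search(r'data-publisher="([^"]+)"', html)
    return match.group(1).strip() if match else None


# Each newly discovered special case becomes one more entry in this list.
EXTRACTORS: List[Callable[[str], Optional[str]]] = [
    _from_labelled_row,
    _from_data_attribute,
]


def extract_publisher(html: str) -> Optional[str]:
    """Return the first publisher any extractor finds, or None if all fail."""
    for extractor in EXTRACTORS:
        value = extractor(html)
        if value:
            return value
    return None
```

A page that matches none of the known variants ends up with an empty publisher, which is why concrete examples of such pages are what is needed to add the next case.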