Advice on how to scrape or use an API for thousands of books?
Hello.
First of all, I use Calibre, but I am learning development and I decided to make a CLI script that fetches metadata for all the books in a folder.
I managed to make most of it work, but I can't figure out how to run queries for several thousand books, or even 100,000.
Google Books has a daily limit of 1,000 requests, and Open Library has a limit of 100 requests every 5 minutes.
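For what it's worth, the way I'm currently thinking of staying under those quotas is to throttle my own requests client-side. This is just a sketch of the idea (the `RateLimiter` class, the ISBNs, and the numbers plugged in are my own, not anything from the APIs' docs):

```python
import time

class RateLimiter:
    """Spaces out calls so the average rate stays under a fixed quota."""

    def __init__(self, max_calls, window_seconds):
        # Minimum gap between two consecutive calls.
        self.min_interval = window_seconds / max_calls
        self.last_call = 0.0

    def wait(self):
        # Sleep just long enough to respect the minimum gap, then record the call.
        now = time.monotonic()
        delta = now - self.last_call
        if delta < self.min_interval:
            time.sleep(self.min_interval - delta)
        self.last_call = time.monotonic()

# Open Library's stated limit of 100 requests per 5 minutes:
limiter = RateLimiter(max_calls=100, window_seconds=300)

for isbn in ["9780140328721", "9780261103573"]:  # hypothetical batch
    limiter.wait()
    # ... here the script would call the metadata API for `isbn` ...
```

Even with this, 100,000 books at 100 requests per 5 minutes works out to roughly three and a half days of wall-clock time, which is why I'm looking for a better approach.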
I heard Kovid mention that he used DuckDuckGo. Mr. Goyal, if you read this by any chance, could you please tell me how you did it?
I wanted to use pyppeteer on DuckDuckGo or Google, but I can't figure out from their robots.txt files what is and isn't permitted.
I don't want to get blacklisted by mistake.
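In case it helps anyone answering: the closest I've gotten to checking this programmatically is the standard library's `urllib.robotparser`, which evaluates a robots.txt against a user agent and URL. The rules and URLs below are made up for illustration; normally you would load the live file with `set_url(...)` and `read()` instead of parsing inline (and I realize robots.txt only covers crawling etiquette, not a site's terms of service):

```python
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
# Real usage would be:
#   rp.set_url("https://duckduckgo.com/robots.txt")
#   rp.read()
# Parsing a made-up example inline keeps this self-contained:
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /",
])

print(rp.can_fetch("MyBookScript/1.0", "https://example.com/search?q=isbn"))  # True
print(rp.can_fetch("MyBookScript/1.0", "https://example.com/private/page"))   # False
```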
I also found out that Google considers quotas above 100,000 queries chump change and will raise the limit for free, but only if I get issued an API key, which requires entering my credit card information.
I don't think the users, or frankly I myself, would be comfortable with that.
Thank you for reading. Please let me know if I need to elaborate further.