Old 01-05-2024, 06:04 AM   #1
Ico
Enthusiast
Posts: 27
Karma: 10000
Join Date: Jan 2019
Device: Kindle PW4
Advice on how to scrape or use an API for thousands of books?

Hello.

First of all, I use Calibre, but I am learning development, so I decided to write a CLI script that fetches metadata for all books in a folder.

I got most of it working, but I can't figure out how to run queries for several thousand books, or even 100,000.

Google Books has a daily limit of 1,000 requests, and Open Library has a limit of 100 requests every 5 minutes.
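For anyone following along, here is a minimal sketch of how a script could stay under limits like these with a sliding-window limiter. The class name `SlidingWindowLimiter` and its parameters are made up for illustration, and the numbers in the usage note are just the limits quoted above, not values checked against either API's documentation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.sleep = sleep
        self.calls = deque()    # timestamps of calls still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next call is allowed."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self.sleep(delay)
        self.calls.append(self.clock())
```

With the limits quoted above, usage would look like `open_library = SlidingWindowLimiter(100, 300)` and `google_books = SlidingWindowLimiter(1000, 86400)`, calling `acquire()` before each request. At 100 requests per 5 minutes, 100,000 books would take about 83 hours; at 1,000 per day, about 100 days. The limiter only lives in memory, so a run spread over several days would also need to save its progress between restarts.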

I heard Kovid mention that he used DuckDuckGo. Mr. Goyal, if you read this by any chance, could you please tell me how you did it?

I wanted to use pyppeteer on DuckDuckGo or Google, but I can't figure out from their robots.txt what is and isn't permitted.
I don't want to get blacklisted by mistake.
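One way to read a robots.txt without guessing is Python's standard `urllib.robotparser`. The sketch below parses a made-up robots.txt (the rules, the `example.com` URLs, and the `metadata-cli` user agent are all placeholders, not what DuckDuckGo or Google actually publish). Note that passing a robots.txt check says nothing about a site's terms of service, which may still forbid automated queries.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Allow: /about
Disallow: /search
Crawl-delay: 5
"""


def parse_robots(text):
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, url, user_agent="metadata-cli"):
    """Return True if the rules let `user_agent` fetch `url`."""
    return parser.can_fetch(user_agent, url)


rules = parse_robots(SAMPLE_ROBOTS)
print(is_allowed(rules, "https://example.com/search?q=dune"))
print(is_allowed(rules, "https://example.com/about"))
print(rules.crawl_delay("metadata-cli"))
```

Against a live site, `parser.set_url(...)` followed by `parser.read()` would fetch the real file instead of the sample text.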

I also found out that Google considers even more than 100,000 queries chump change, and they will raise the limit for free if I get an API key, for which I have to enter my credit card information.

I don't think the users, or frankly I myself, would be comfortable with that.

Thank you for reading. Please let me know if I need to elaborate further.