Quote:
Originally Posted by kovidgoyal
You will find amazon will start captchaing your scraper soon enough.
|
It already does, but the script runs the Selenium webdriver with a visible browser window (not headless) and waits for you to enter the captcha before proceeding. I've found that as long as the same browser session and tab are used for every request, the captcha only needs to be entered once at the very beginning of the run.
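For anyone wanting to try the same approach, here is a rough sketch of the idea, not the actual script. The function names and the text-based captcha check are my own assumptions for illustration; the real marker depends on whatever page the site serves. The Selenium import is done lazily so the loop itself can be tried with any object that has `get()` and `page_source`.

```python
def make_visible_driver():
    # Imported lazily so the rest of the sketch works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # No "--headless" argument is added, so the browser window stays visible
    # and a person can type in the captcha.
    return webdriver.Chrome(options=options)


def looks_like_captcha(page_source):
    # Assumption: a crude text check, purely for illustration.
    return "captcha" in page_source.lower()


def fetch_all(driver, urls, wait_for_user=input):
    """Load every URL through one driver (one session, one tab)."""
    pages = {}
    for url in urls:
        driver.get(url)
        # Pause until a person has cleared the captcha in the open window.
        while looks_like_captcha(driver.page_source):
            wait_for_user("Solve the captcha in the browser, then press Enter... ")
        pages[url] = driver.page_source
    return pages
```

Usage would be something like `pages = fetch_all(make_visible_driver(), list_of_urls)`. The point is that the driver is created once and reused for every request, rather than starting a fresh browser per book.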
I've run it against around 800 books several times during development and testing, and it's been fine so far.
They could very easily change their page structure or apply additional restrictions, though. That's always a risk.