10-24-2025, 06:47 AM   #16
unkn0wn
Guru
Posts: 646
Karma: 85520
Join Date: May 2021
Device: kindle
The WSJ Magazine and WSJ News recipes will no longer work. We have been extremely lucky for some time, as I found a workaround for WSJ with GraphQL.

archive.is is now hitting a CAPTCHA page when it fetches content from WSJ, and we can't do anything to fix that. Maybe wait for archive.is to update.
11-03-2025, 05:24 AM   #17
nickredding
onlinenewsreader.net
Posts: 336
Karma: 10143
Join Date: Dec 2009
Location: Kelowna BC
Device: Various
archive.is screening

It appears that archive.is is screening traffic from Wi-Fi networks. Accessing archive.is via mobile networks (probably identified from the carrier IP address) doesn’t attract the screening. I’m assuming this is an anti-scraping strategy, and it’s not confined to WSJ.
11-03-2025, 11:29 AM   #18
unkn0wn
Someone who is facing this issue should try adding a delay to the recipe and tell us if it works.

I don't use this recipe much and I could not replicate this issue for testing.
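
For anyone who wants to try it, here is a minimal sketch of what adding a delay to a recipe could look like. It relies on the `delay` and `simultaneous_downloads` attributes of `BasicNewsRecipe`; the class name, title, and the 10-second value are placeholders I made up, not taken from the WSJ recipe. The fallback stub is only there so the snippet also loads outside calibre.

```python
# Sketch only: throttle a recipe's downloads.
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so this sketch can be loaded outside calibre.
    class BasicNewsRecipe:
        pass


class ThrottledRecipe(BasicNewsRecipe):  # hypothetical recipe name
    title = 'Throttled example'          # hypothetical title

    # Seconds to wait between consecutive downloads (assumed value).
    delay = 10

    # Fetch one page at a time.
    simultaneous_downloads = 1
```

In an existing recipe, only the `delay` and `simultaneous_downloads` lines would need to be added to the recipe class.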
11-05-2025, 11:48 AM   #19
nickredding
Delay doesn’t affect this: I get screened on the first attempt to load an article via Wi-Fi, but if I switch to mobile data and try again, the article loads. Note that I’m not using Calibre for this.
12-24-2025, 05:45 PM   #20
PowerfulGarbage
Junior Member
Posts: 1
Karma: 10
Join Date: Nov 2025
Device: Kindle
I’m also getting this error. Mobile data didn’t change it.
12-28-2025, 04:11 PM   #21
nickredding
archive.is behaviour

archive.is uses a combination of web browser detection and geolocation to determine if a captcha challenge should be presented.

Any access from a web browser is challenged.

Any access from a USA IP address is challenged.

From an IP address outside of the USA, mobile apps using iOS or Android HTTP access are not challenged. However, VPNs don’t help: it seems archive.is detects them and issues a challenge.

If the difference between web browser access and iOS/Android app access could be determined, it might be possible to modify the Python mechanize apparatus to mimic the native apps and get around the captcha challenge. However, it would only work for users outside of the USA.

So, unless someone can figure out how to successfully respond to a captcha challenge, it looks like the end of the line for recipes that depend on archive.is.
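
To make the observations above easier to check against other people's results, here they are written out as a small Python sketch. This only models what I saw from the outside; it is not archive.is's actual logic, and every name in it (`Access`, `is_challenged`, the field names) is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Access:
    """One request to archive.is, described by what was observed (hypothetical model)."""
    client: str            # 'browser' or 'native_app' (iOS/Android HTTP access)
    country: str           # two-letter country code of the source IP, e.g. 'US', 'CA'
    via_vpn: bool = False  # True if the request goes through a VPN


def is_challenged(access: Access) -> bool:
    """Return True if the observations above predict a captcha challenge."""
    if access.client == 'browser':
        return True   # any access from a web browser is challenged
    if access.country == 'US':
        return True   # any access from a USA IP address is challenged
    if access.via_vpn:
        return True   # VPNs appear to be detected and challenged
    return False      # native app, outside the USA, no VPN


print(is_challenged(Access('native_app', 'CA')))  # native app outside the USA
print(is_challenged(Access('browser', 'CA')))     # browser outside the USA
```

If someone sees behaviour that doesn't match one of these branches, that would be useful to report.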
02-09-2026, 06:20 PM   #22
nickredding
More on URL blocking

It looks like Cloudflare is being used widely as an anti-scraping and bot blocking service.

Cloudflare has developed a mechanism called "Private Access Tokens" which is subscribed to by iOS and Android to provide validation that a network request is originating from an actual user device. This mechanism is invoked both by web browsers and native apps using iOS or Android network requests.

Private Access Tokens are intended to reduce (or even eliminate) the need for captcha challenges to block scrapers and bots, and they seem to be very successful.

It looks like archive.is is using Cloudflare and its own mechanisms (see my previous message) to repel scrapers and bots.

Interestingly, archive.is issues captcha challenges for access from the iOS Safari browser but not for native apps using iOS URLSession.

None of this suggests a way to get around Cloudflare: calibre is a web scraper, and Cloudflare is doing what it is designed to do by blocking it. But it does shed some light on why native apps that access network resources on demand (as opposed to batch scraping them) continue to work.