New Recipe: The Skeptic
Developing this recipe turned up two interesting things. The first is that the blog where some of the feeds originate runs "Bad Behavior," which flagged Calibre's recipe scraping as a bot and returned a 403 error. Playing around with TamperData, and comparing the headers sent by Firefox with those sent by Calibre, showed that Calibre needed to send at least a simple Accept: header to avoid being treated as a spambot. This recipe adds the needed header to the initial GET request.
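As a rough illustration of the header fix, here is a minimal standard-library sketch that builds a GET request carrying an explicit Accept: header. It is not the recipe's own code: the URL, the function name build_request, and the header value text/html are all assumptions, since the post does not say which value was used. Nothing is sent over the network here.

```python
from urllib.request import Request

# Placeholder URL; the real feed address is not given in the post.
FEED_URL = 'https://example.com/feed/'


def build_request(url, accept='text/html'):
    """Build a GET request that carries a simple Accept: header.

    The 'text/html' default is an assumed value for illustration only.
    """
    req = Request(url)  # no data argument, so this is a GET
    req.add_header('Accept', accept)
    return req


request = build_request(FEED_URL)
# Show the headers that would accompany the request.
print(request.header_items())
```

In a calibre recipe the same idea would most likely live in an override of get_browser that adds the header to the browser object before returning it, but treat that as a pointer to check against the calibre documentation rather than a description of this recipe.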
The second interesting thing in this recipe is that I wanted to remove all tags that started with "follow," such as "followX" or "followY." The recipe handles this with a regex in remove_tags.
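A sketch of what a regex-based remove_tags entry could look like follows. The post does not say which tag or attribute carried the "follow" names, so a div with a class attribute is assumed here, and should_remove is a hypothetical helper that only imitates the matching so the example runs without calibre installed.

```python
import re

# Anchored at the start so "followX" matches but "unfollow" does not.
FOLLOW = re.compile(r'^follow')

# Assumed shape: a div whose class begins with "follow".
# The actual tag name and attribute are not stated in the post.
remove_tags = [dict(name='div', attrs={'class': FOLLOW})]


def should_remove(tag_name, attrs):
    """Return True if a tag matches any rule in remove_tags.

    Simplified stand-in for the real matching: every attribute
    pattern in a rule must be found in the tag's attribute value.
    """
    for rule in remove_tags:
        if rule['name'] != tag_name:
            continue
        if all(pattern.search(attrs.get(key, ''))
               for key, pattern in rule['attrs'].items()):
            return True
    return False
```

With this rule, a div with class "followX" or "followY" would be dropped, while a div with class "content" or a span with class "followX" would be kept.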