Quote:
Originally Posted by gambarini
This is a classic case of obfuscated links, but let me explain a few things first. This January, Kovid and I exchanged several emails about slow feed downloads. After some experiments I found that the main culprit was the use of obfuscated links from the feed. The solution was to update the default implementation of get_article_url to take into account not only the link tag but also feedburner:OrigLink, which (if it exists) contains the real, non-obfuscated link. However, this solution does not cover all cases. Some feeds have no origlink tag and use the guid tag instead. In those cases the recipe developer should override get_article_url and read the value of the guid tag. That way we get the maximum download speed, and optionally we can work on the print URL if the site offers one.
punto-informatico.it does not offer a special print page, so you will need to scrape the default page. Just add this to your recipe to get the real links:
Code:
def get_article_url(self, article):
    # The guid tag holds the real (non-obfuscated) article link
    return article.get('guid', None)
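If you want something a bit more defensive than the one-liner above, here is a rough sketch of the lookup order described in this post: original link first, then guid, then the plain link. It is only an illustration, not calibre's actual code. The helper name pick_article_url is made up, the key 'feedburner_origlink' is my assumption about how the parsed feed entry exposes feedburner:OrigLink, and the entry is assumed to behave like a dict. The http check is there because a guid is not guaranteed to be a URL.

```python
def pick_article_url(article):
    """Return the most direct URL found in a dict-like feed entry, or None.

    Hypothetical helper: the name and the 'feedburner_origlink' key
    are assumptions made for this sketch.
    """
    # Try the least obfuscated candidates first, the plain link last.
    for key in ('feedburner_origlink', 'guid', 'link'):
        url = article.get(key)
        # Skip missing values and guids that are not real URLs.
        if url and url.startswith(('http://', 'https://')):
            return url
    return None
```

In a recipe you could then have get_article_url simply return pick_article_url(article), so a feed with a non-URL guid still falls back to the normal link.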