03-30-2010, 07:25 AM   #1693
gambarini
Quote:
Originally Posted by kiklop74
This is a classic case of obfuscated links. But let me explain a few things first. This January, Kovid and I exchanged several emails about slow feed downloads. After some experiments I found that the main culprit was the use of obfuscated links from the feed. The solution was to update the default implementation of get_article_url to take into account not only the link tag but also feedburner:OrigLink, which (if it exists) contains the real, non-obfuscated link. However, this solution does not cover all cases. Sometimes feeds do not have the origlink tag but use the guid tag instead. In those cases a recipe developer should override get_article_url and read the value of the guid tag. That way we get the maximum download speed, and optionally we can work with the print URL if the site offers one.

punto-informatico.it does not offer a special print page, so you will need to scrape the default page. Just add this to your recipe to get the real links:

Code:
def get_article_url(self, article):
    # Use the feed entry's guid tag, which carries the real article link here.
    return article.get('guid', None)
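
A minimal sketch of the fallback order described above (origlink first, then guid, then the plain link tag). This is not calibre's actual implementation: pick_article_url and RecipeSketch are hypothetical names, the article is treated as a plain dict-like feed entry, and the assumption that the feed parser exposes feedburner:OrigLink under a key ending in '_origlink' should be checked against your own feed.

```python
def pick_article_url(article):
    """Return the most direct URL available for a dict-like feed entry."""
    # Assumption: feedburner:OrigLink shows up under a key ending in
    # '_origlink' (for example 'feedburner_origlink').
    for key in article.keys():
        if key.endswith('_origlink') and article[key]:
            return article[key]

    # Some feeds carry the real link in guid instead. Only use it when it
    # looks like a URL, since a guid can also be an opaque identifier.
    guid = article.get('guid')
    if guid and guid.startswith(('http://', 'https://')):
        return guid

    # Last resort: the (possibly obfuscated) link tag.
    return article.get('link')


class RecipeSketch(object):
    """Hypothetical stand-in for a recipe class, for illustration only."""

    def get_article_url(self, article):
        return pick_article_url(article)
```

In a real recipe only the body of get_article_url would be needed; the standalone function is there so the selection order can be tried on plain dictionaries without calibre installed.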
The link returned from get_article_url is correct (great!), but in the EPUB I find only:

This article was downloaded by calibre from
http://punto-informatico.it/2843719/...e-opteron.aspx

Last edited by gambarini; 03-30-2010 at 10:28 AM.