07-07-2011, 10:16 PM   #1
Bortolotto
How to handle RSS sites with redirections (cgi-bin)


Hi Buddies!

I was wondering if somebody knows how to handle RSS links that go through a redirection (cgi-bin), like those of the Brazilian website "IDG Now!". Here is one example:

1 - The index page: http://rss.idgnow.com.br/c/32184/f/499640/index.rss

2 - For instance, a link to one article with redirection: http://rss.idgnow.com.br/c/32184/f/4...SS/story01.htm

3 - The redirected news page: http://idgnow.uol.com.br/internet/20...o-de-usuarios/

4 - The printable version: http://idgnow.uol.com.br/internet/20...iciaPrint_view

Thank you!
07-07-2011, 11:01 PM   #2
kovidgoyal
creator of calibre
That looks like it is using a URL mangling scheme commonly used by some RSS feed aggregators. Look at the built-in toi.recipe or ilsole24ore.recipe for examples of handling it.
07-09-2011, 12:13 AM   #3
Bortolotto
That is it! My recipe (IDG Now! Brazil) is working pretty well!

Thank you very much Kovid!

I followed your advice and read ilsole24ore.recipe. Based on that, I wrote the recipe below (it's attached too), and it's working pretty well.

The source is IDG Now! Brazil, an affiliate of International Data Group Inc.
I hope my recipe will be useful!

Spoiler:
from calibre.web.feeds.news import BasicNewsRecipe


class IDGNow(BasicNewsRecipe):
    title = 'IDG Now!'
    __author__ = 'Diniz Bortolotto'
    description = 'Posts do IDG Now!'
    oldest_article = 7
    max_articles_per_feed = 20
    encoding = 'utf8'
    publisher = 'Now!Digital Business Ltda.'
    category = 'technology, telecom, IT, Brazil'
    language = 'pt_BR'
    publication_type = 'technology portal'
    use_embedded_content = False
    extra_css = '.headline {font-size: x-large;} \n .fact { padding-top: 10pt }'

    def get_article_url(self, article):
        link = article.get('link', None)
        if link is None:
            # No link in this feed entry, so there is nothing to decode.
            return None
        # Mangled links end in .../<encoded target>/story01.htm
        if link.split('/')[-1] == 'story01.htm':
            link = link.split('/')[-2]
            # Escape codes used in the feed links and the text they stand for.
            a = ['0B', '0C', '0D', '0E', '0F', '0G', '0I', '0N', '0L0S', '0A', '0J3A']
            b = ['.', '/', '?', '-', '=', '&', '_', '.com', 'www.', '0', ':']
            for code, plain in zip(a, b):
                link = link.replace(code, plain)
            # The decoded text is a query string; the article URL is the
            # value of its third-from-last parameter.
            link = link.split('&')[-3]
            link = link.split('=')[1]
            # Point at the printable version of the article page.
            link = link + '/IDGNoticiaPrint_view'
        return link

    feeds = [
        (u'Ultimas noticias', u'http://rss.idgnow.com.br/c/32184/f/499640/index.rss'),
        (u'Computa\xe7\xe3o Corporativa', u'http://rss.idgnow.com.br/c/32184/f/499643/index.rss'),
        (u'Carreira', u'http://rss.idgnow.com.br/c/32184/f/499644/index.rss'),
        (u'Computa\xe7\xe3o Pessoal', u'http://rss.idgnow.com.br/c/32184/f/499645/index.rss'),
        (u'Internet', u'http://rss.idgnow.com.br/c/32184/f/499646/index.rss'),
        (u'Mercado', u'http://rss.idgnow.com.br/c/32184/f/419982/index.rss'),
        (u'Seguran\xe7a', u'http://rss.idgnow.com.br/c/32184/f/499647/index.rss'),
        (u'Telecom e Redes', u'http://rss.idgnow.com.br/c/32184/f/499648/index.rss')
    ]

    reverse_article_order = True
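
For anyone who wants to try the decoding outside calibre, here is a minimal standalone sketch of the same steps as get_article_url. The substitution pairs are copied from the recipe; the function names (decode_segment, printable_url) and the sample link are made up for illustration, since the real feed links in this thread are truncated. It assumes the decoded text is a query string whose third-from-last parameter holds the article URL, which is what the recipe relies on.

```python
# Escape codes and their plain-text meaning, in the order the recipe applies them.
SUBSTITUTIONS = [
    ('0B', '.'), ('0C', '/'), ('0D', '?'), ('0E', '-'), ('0F', '='),
    ('0G', '&'), ('0I', '_'), ('0N', '.com'), ('0L0S', 'www.'),
    ('0A', '0'), ('0J3A', ':'),
]


def decode_segment(segment):
    """Undo the escaping on one path segment of a mangled feed link."""
    for code, plain in SUBSTITUTIONS:
        segment = segment.replace(code, plain)
    return segment


def printable_url(link):
    """Return the printable-page URL for a mangled link, or the link unchanged."""
    parts = link.split('/')
    if parts[-1] != 'story01.htm':
        return link  # not a mangled link, leave it alone
    decoded = decode_segment(parts[-2])
    # Assumption taken from the recipe: the article URL is the value of the
    # third-from-last '&'-separated parameter.
    target = decoded.split('&')[-3].split('=')[1]
    return target + '/IDGNoticiaPrint_view'


if __name__ == '__main__':
    # Hypothetical mangled link, built by hand for this example only.
    sample = ('http://feeds.example.com/c/0/f/0/l/'
              'ref0Frss0Gurl0Fhttp0J3A0C0Cexample0N0Cnews0Csome0Estory'
              '0Gsrc0Ffeed0Gfmt0Frss/story01.htm')
    print(printable_url(sample))
    # prints: http://example.com/news/some-story/IDGNoticiaPrint_view
```

The replacements run one after another, so their order matters: '0A' (a literal zero) is decoded late, which keeps a decoded '0' from being read as the start of another escape code.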
Attached Files
File Type: txt IDGNowBrazil.txt (2.1 KB, 220 views)