Quote:
Originally Posted by Sciamano
Hi everyone,
I'm new to the boards and need some help with a recipe.
I'm going to be a commuter soon, so I wanted to create a recipe to download all the news that gets published on my favorite (Italian) soccer team's website.
This is the link to the RSS feed:
http://veleno.inter.it/aas/rss/index_full_it.xml
I've created this very simple custom recipe:
Code:
class AdvancedUserRecipe1300997108(BasicNewsRecipe):
    title = u'Inter'
    oldest_article = 7
    max_articles_per_feed = 100
    feeds = [(u'Inter News', u'http://veleno.inter.it/aas/rss/index_full_it.xml')]
    remove_tags = [dict(name='div', attrs={'class':'piccolowww'})]
It seems to work fine, except for one little thing: at the start of each article, where the date (day of the week, date, time) is written, some letters in the ebook are changed.
For example, this is what it should read for today's news:
Giovedì, 24 Marzo 2011 14:44:03
But this is what I find in the resulting eBook:
Giovedě, 24 Marzo 2011 14:44:03
(see? the "ì" has been transformed to "ě")
Not a big deal, I can live with that, but since I'm a perfectionist, I'd like to solve it.
Also, if someone could help me remove the RSS logo images and the "permalink" link after the date, it would be great! I've tried but was not successful.
Thanks!!
|
Add a line specifying encoding to your recipe:
Code:
class AdvancedUserRecipe1300997108(BasicNewsRecipe):
    title = u'Inter'
    # Override the character encoding used to decode the site's pages
    encoding = 'ISO-8859-15'
    oldest_article = 7
    max_articles_per_feed = 100
    feeds = [(u'Inter News', u'http://veleno.inter.it/aas/rss/index_full_it.xml')]
    remove_tags = [dict(name='div', attrs={'class':'piccolowww'})]
and this problem should be solved.
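If you're curious why a wrong encoding produces exactly that symptom, here is a small standalone sketch in plain Python 3 (not part of the recipe). It assumes the site serves its pages as ISO-8859-15, as the fix implies, and that they were being decoded with a Central European charset such as ISO-8859-2. That second part is only a guess at what went wrong, but it does reproduce the "ě" you saw.

```python
# The date line as bytes, assuming the site really serves ISO-8859-15.
raw = "Giovedì, 24 Marzo 2011 14:44:03".encode("iso-8859-15")

# Assumption: the bytes were decoded as ISO-8859-2 (Central European).
# In that charset, byte 0xEC is "ě" rather than "ì".
wrong = raw.decode("iso-8859-2")

# Decoding with the encoding named in the recipe restores the text.
right = raw.decode("iso-8859-15")

print(wrong)  # Giovedě, 24 Marzo 2011 14:44:03
print(right)  # Giovedì, 24 Marzo 2011 14:44:03
```

The bytes themselves never change; only the table used to turn them into characters does, which is why setting encoding in the recipe is enough.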