03-11-2011, 01:02 PM   #1
TonyDeWonderful
Device: Kindle 3
Calibre rss recipe -- <em> tag in article titles?

This is my first time posting, and I'm fairly new to writing recipes. I'm having a problem with a recipe that I use to download an RSS feed to my Kindle 3.

The recipe itself works fine, except that the article titles sometimes contain literal <em> and </em> tags. For example, on both the Kindle and in Calibre v. 0.7.48, an article title shows up as "<em>Godzilla</em> vs. Real Life". The same tags were also appearing in the main title once you opened the article, but I was able to remove them there via "preprocess_regexps".

Since "preprocess_regexps" does not affect the article title in the feed listing, can someone give me some direction on how to remove the <em> and </em> tags from the article title?
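
To make the goal concrete, here is a minimal sketch of the cleanup I'm after, written as plain Python with no calibre dependency. The helper name strip_em_tags is hypothetical and is not part of calibre. Where to hook it into a recipe is an assumption on my part; applying it to each article title from an overridden parse_feeds might be one place to try.

```python
import re

# Hypothetical helper, not part of calibre: removes <em> and </em> from a
# title, whether the tags arrive literally or as escaped entities.
_EM_TAG = re.compile(r'(?:<|&lt;)/?em(?:>|&gt;)', re.IGNORECASE)


def strip_em_tags(title):
    """Return the title with any <em>/</em> markers removed."""
    return _EM_TAG.sub('', title)


if __name__ == '__main__':
    print(strip_em_tags(u'<em>Godzilla</em> vs. Real Life'))
```

A single pattern covers both the opening and closing tag, so only one substitution pass is needed per title.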

I've included the recipe that I'm using below.

Thanks!

recipe:

Code:
import re
from calibre.web.feeds.recipes import BasicNewsRecipe


class AdvancedUserRecipe1288623850(BasicNewsRecipe):
    title = u'Hit and Run Blog'
    oldest_article = 1
    max_articles_per_feed = 100
    timefmt = ''
    encoding = 'cp1252'
    # Strip escaped <em> and </em> tags from the downloaded article HTML.
    preprocess_regexps = [
        (re.compile(r"&lt;em&gt;"), lambda match: ''),
        (re.compile(r"&lt;/em&gt;"), lambda match: ''),
    ]
    feeds = [
        (u'Hit and Run Blog', 'http://feeds.feedburner.com/reason/HitandRun'),
    ]
Moderator Notice
Code tags added.

Last edited by Starson17; 03-11-2011 at 02:04 PM.