04-23-2012, 03:48 AM   #1
wmaurer
Recipe help please


I have a simple recipe that's not working the way I want:

class DerBund(BasicNewsRecipe):
    title = ' Nichts verpassen'
    description = 'Nachrichten, Analysen, Bilder und Video zu Politik, Wirtschaft, Sport, Kultur, Wissen, Technik, Auto und mehr.'
    oldest_article = 7
    max_articles_per_feed = 100
    timefmt = ' [%a, %d %b %Y]'
    keep_only_tags = dict(id='singleLeft')
    remove_tags_after = dict(attrs={'class':'publishedDate'})
    remove_tags =  [dict(id=['contentbox', 'metaLine', 'singleRight', 'singleSmallRight']), dict(name=['script', 'noscript', 'style'])]
    no_stylesheets = True
    feeds = [('Front', '')]

    def print_version(self, url):
        return url + "/print.html"

Basically, I'm trying to get content from only the singleLeft div, but for some reason ebook-convert is also fetching content from the singleRight and singleSmallRight divs. I've even tried adding these two to the remove_tags list, but they're still being included.
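One thing that may be worth ruling out: keep_only_tags above is a bare dict, while remove_tags is a list of dicts. The plain-Python sketch below does not import calibre; it only shows what a loop sees in each case. The helper iter_specs is hypothetical, and whether calibre loops over keep_only_tags this way (or handles a bare dict specially) is an assumption, not something confirmed here.

```python
# Plain-Python illustration only; nothing here calls calibre.
bare_spec = dict(id='singleLeft')        # the form used for keep_only_tags in the recipe
listed_spec = [dict(id='singleLeft')]    # the list-of-dicts form used for remove_tags


def iter_specs(specs):
    """Hypothetical helper: collect whatever a for-loop over `specs` yields."""
    return [spec for spec in specs]


# Iterating a dict yields its keys, not the dict itself.
print(iter_specs(bare_spec))     # ['id']
# Iterating a list of dicts yields each tag specification.
print(iter_specs(listed_spec))   # [{'id': 'singleLeft'}]
```

If that assumption holds, writing keep_only_tags = [dict(id='singleLeft')] would be a cheap experiment to try.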

Can anyone offer me a hint as to why this is happening, and how to fix it?

Thanks & Cheers