Old 12-20-2010, 11:24 AM   #1
Peter_BE
Updated: Gazet van Antwerpen (GVA_BE)

Hi,

I've been using the GVA recipe for a few months now, but the newspaper recently changed its feeds so that the overview feed also contains every article from the other feeds. As a result, each article is now downloaded twice.

I therefore removed that feed from the recipe, and also fixed an error that caused the "Science" section to contain "Media" articles.
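
For anyone who would rather keep an overview feed, an alternative to removing it is to drop articles whose URL has already been seen in an earlier feed. The following is only a rough sketch of that idea in plain Python: the function name, the `(title, [urls])` data shape, the 'Overzicht' label, and the example.com URLs are all made up for illustration and are not part of calibre or of the recipe below.

```python
def drop_duplicate_articles(feeds):
    """Keep only the first occurrence of each article URL across all feeds.

    `feeds` is a list of (section_title, [article_url, ...]) pairs.
    This is a simplified, hypothetical stand-in for parsed feed data,
    not one of calibre's own types.
    """
    seen = set()
    result = []
    for title, urls in feeds:
        unique = []
        for url in urls:
            if url not in seen:
                seen.add(url)
                unique.append(url)
        result.append((title, unique))
    return result


# Made-up data: the overview feed repeats articles from the section feeds.
sample = [
    ('Binnenland', ['http://example.com/a1', 'http://example.com/a2']),
    ('Sport', ['http://example.com/a3']),
    ('Overzicht', ['http://example.com/a1', 'http://example.com/a3',
                   'http://example.com/a4']),
]
deduped = drop_duplicate_articles(sample)
```

Feed order matters here: putting the overview feed last means the section feeds keep their articles and the overview only retains what appears nowhere else.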

This is the new gva_be.recipe:

Code:
#!/usr/bin/env  python

__license__   = 'GPL v3'
__copyright__ = '2009, Darko Miletic <darko.miletic at gmail.com>'
'''
www.gva.be
'''
from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag

class GazetvanAntwerpen(BasicNewsRecipe):
    title                 = 'Gazet van Antwerpen'
    __author__            = 'Darko Miletic'
    description           = 'News from Belgium in Dutch'
    publisher             = 'Gazet van Antwerpen'
    category              = 'news, politics, Belgium'
    oldest_article        = 2
    max_articles_per_feed = 100
    no_stylesheets        = True
    use_embedded_content  = False
    encoding              = 'utf-8'
    language              = 'nl_BE'

    lang                  = 'nl-BE'
    direction             = 'ltr'

    html2lrf_options = [
                          '--comment'  , description
                        , '--category' , category
                        , '--publisher', publisher
                        ]

    html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"\noverride_css=" p {text-indent: 0cm; margin-top: 0em; margin-bottom: 0.5em} "'

    keep_only_tags = [dict(name='div', attrs={'id':'article'})]
    remove_tags    = [
                         dict(name=['embed','object'])
                       , dict (name='div',attrs={'class':['note NotePortrait','note']})
                     ]
    remove_tags_after  = dict(name='span', attrs={'class':'author'})

    feeds = [
              (u'Binnenland'            , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/binnenland'    )
             ,(u'Buitenland'            , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/buitenland'    )
             ,(u'Stad & Regio'          , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/stadenregio'   )
             ,(u'Economie'              , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/economie'      )
             ,(u'Media & Cultuur'       , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/mediaencultuur')
             ,(u'Wetenschap'            , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/wetenschap'    )
             ,(u'Sport'                 , u'http://www.gva.be/syndicationservices/artfeedservice.svc/rss/mostrecent/sport'         )
            ]

    def preprocess_html(self, soup):
        # Drop the page's onload handler and all inline styles
        del soup.body['onload']
        for item in soup.findAll(style=True):
            del item['style']
        # Declare language and text direction on the <html> element
        soup.html['lang']     = self.lang
        soup.html['dir' ]     = self.direction
        # Add Content-Language and Content-Type meta tags at the top of <head>
        mlang = Tag(soup,'meta',[("http-equiv","Content-Language"),("content",self.lang)])
        mcharset = Tag(soup,'meta',[("http-equiv","Content-Type"),("content","text/html; charset=utf-8")])
        soup.head.insert(0,mlang)
        soup.head.insert(1,mcharset)
        return soup
Old 12-20-2010, 12:09 PM   #2
kovidgoyal
Thanks, the builtin recipe has been updated.