Register Guidelines E-Books Today's Posts Search

Go Back   MobileRead Forums > E-Book Software > Calibre > Recipes

Notices

Reply
 
Thread Tools Search this Thread
Old 05-14-2011, 09:47 PM   #1
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
recipe for Golem.de - German

because "derdon" had asked for the recipe
viel spass damit

Code:
import string, re
from calibre import strftime
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title          = u'Golem.de'
    __author__  = 'schuster'

    oldest_article = 7
    max_articles_per_feed = 10
    no_stylesheets         = True
    use_embedded_content   = False
    language               = 'de'
    cover_url = 'http://www.e-energy.de/images/logo_golem.jpg'
    masthead_url = 'http://www.golem.de/staticrl/images/logo.png'
    extra_css = '''
                    h2{font-family:Arial,Helvetica,sans-serif; font-size: x-small;}
                    h1{ font-family:Arial,Helvetica,sans-serif;  font-size:x-large; font-weight:bold;}

                '''
    remove_javascript      = True
    remove_tags_befor = [dict(name='header', attrs={'class':'cluster-header'})]
    remove_tags_after = [dict(name='p', attrs={'class':'meta'})]
    remove_tags = [dict(rel='nofollow'),
		   dict(name='header', attrs={'id':'header'}),
		   dict(name='div', attrs={'class':'dh1'}),
		   dict(name='label', attrs={'class':'implied'}),
		   dict(name='section', attrs={'id':'comments'}),
		   dict(name='li', attrs={'class':'gg_prebackcounterItem'}),
		   dict(name='li', attrs={'class':'gg_prebackcounterItem gg_embeddedIndexCounter'}),
		   dict(name='img', attrs={'class':'gg_embeddedIconRight gg_embeddedIconFS gg_cursorpointer'}),
		   dict(name='div', attrs={'target':'_blank'})
]

    def get_browser(self, *args, **kwargs):
       from calibre import browser
       kwargs['user_agent'] = 'mozilla'
       return browser(*args, **kwargs)

    def get_article_url(self, article):
       original_url = BasicNewsRecipe.get_article_url(self, article)
       raw = browser().open_novisit(original_url).read()
       soup = BeautifulSoup(raw)

    def get_article_url(self, article):
        return article.get('id', article.get('guid', None))

    def preprocess_html(self, soup):
        for alink in soup.findAll('a'):
            if alink.string is not None:
               tstr = alink.string
               alink.replaceWith(tstr)
        return soup

    feeds          = [(u'Audio/Video', u'http://rss.golem.de/rss.php?tp=av&feed=RSS2.0'),
                          (u'Foto', u'http://rss.golem.de/rss.php?tp=foto&feed=RSS2.0'),
                          (u'Games', u'http://rss.golem.de/rss.php?tp=games&feed=RSS2.0'),
                          (u'Handy', u'http://rss.golem.de/rss.php?tp=handy&feed=RSS2.0'),
                          (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS2.0'),
                          (u'Mobile', u'http://rss.golem.de/rss.php?tp=mc&feed=RSS2.0'),
                          (u'OSS', u'http://rss.golem.de/rss.php?tp=oss&feed=RSS2.0'),
                          (u'Politik/Recht', u'http://rss.golem.de/rss.php?tp=pol&feed=RSS2.0'),
                          (u'Security', u'http://rss.golem.de/rss.php?tp=sec&feed=RSS2.0'),
                          (u'Desktop-Applikationen', u'http://rss.golem.de/rss.php?tp=apps&feed=RSS2.0'),
                          (u'Software-Entwicklung', u'http://rss.golem.de/rss.php?tp=dev&feed=RSS2.0'),
                          (u'Wirtschaft', u'http://rss.golem.de/rss.php?tp=wirtschaft&feed=RSS2.0'),
                          (u'Hardware', u'http://rss.golem.de/rss.php?r=hw&feed=RSS2.0'),
                          (u'Software', u'http://rss.golem.de/rss.php?r=sw&feed=RSS2.0'),
                          (u'Networld', u'http://rss.golem.de/rss.php?r=nw&feed=RSS2.0'),
                          (u'Entertainment', u'http://rss.golem.de/rss.php?r=et&feed=RSS2.0'),
                          (u'TK', u'http://rss.golem.de/rss.php?r=tk&feed=RSS2.0'),
                          (u'Wirtschaft', u'http://rss.golem.de/rss.php?r=wi&feed=RSS2.0'),
                          (u'E-Commerce', u'http://rss.golem.de/rss.php?r=ec&feed=RSS2.0')

]
schuster is offline   Reply With Quote
Old 05-15-2011, 10:20 AM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Isn't there already a recipe for golem.de, is yours different?
kovidgoyal is offline   Reply With Quote
Old 05-15-2011, 11:30 AM   #3
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
hi kovid,

yes the one that is build-in doesn't fetch news, because the site has changed their layout and now linked via feedsportal.com.

see this thread also:
Code:
https://www.mobileread.com/forums/showthread.php?t=133158

greetings
schuster is offline   Reply With Quote
Old 05-15-2011, 11:33 AM   #4
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Ah, ok
kovidgoyal is offline   Reply With Quote
Reply


Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Golem.de changed something derdon Recipes 0 05-14-2011 05:52 AM
The Golem Zeke General Discussions 3 01-27-2011 10:32 PM
Other Fiction Meyrink, Gustav: Der Golem german v1 01 feb 2009 netseeker ePub Books 0 02-01-2009 01:04 PM
Other Fiction Meyrink, Gustav: Der Golem german v1 01 feb 2009 netseeker Kindle Books 0 02-01-2009 01:02 PM
Other Fiction Meyrink, Gustav: Der Golem german v1 01 feb 2009 netseeker BBeB/LRF Books 0 02-01-2009 01:00 PM


All times are GMT -4. The time now is 10:38 AM.


MobileRead.com is a privately owned, operated and funded community.