05-18-2011, 03:43 PM   #1
schuster
Recipe for Börse-online.de (German)

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title          = u'Börse-online'
    __author__  = 'schuster'
    oldest_article = 1
    max_articles_per_feed = 100
    no_stylesheets         = True
    use_embedded_content   = False
    language               = 'de'
    remove_javascript      = True
    cover_url = 'http://www.dpv.de/images/1995/source.gif'
    masthead_url = 'http://www.zeitschriften-cover.de/cover/boerse-online-cover-januar-2010-x1387.jpg'
    extra_css = '''
                    h1{font-family:Arial,Helvetica,sans-serif; font-weight:bold;font-size:large;}
                    h4{font-family:Arial,Helvetica,sans-serif; font-weight:normal;font-size:small;}
                    img {min-width:300px; max-width:600px; min-height:300px; max-height:800px}
                    p{font-family:Arial,Helvetica,sans-serif;font-size:small;}
                    body{font-family:Helvetica,Arial,sans-serif;font-size:small;}
    '''
    remove_tags_before = dict(name='h3')
    remove_tags_after = [dict(name='div', attrs={'class':'artikelfuss'})]
    remove_tags = [dict(attrs={'class':['moduleTopNav', 'moduleHeaderNav', 'text', 'blau', 'poll1150']}),
                dict(id=['newsletterlayer', 'newsletterlayerClose', 'newsletterlayer_body', 'newsletterarray_error', 'newsletterlayer_emailadress', 'newsletterlayer_submit', 'kommentar']),
                dict(name=['h2', 'Gesamtranking', 'h3',''])]

    def print_version(self, url):
        return url.replace('.html#nv=rss', '.html?mode=print')



    feeds          = [(u'Börsennachrichten', u'http://www.boerse-online.de/rss/')]
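
As a rough illustration of what print_version does in this recipe, the sketch below applies the same string replacement outside calibre. The function is a standalone copy without `self`, and the sample article URL is made up for illustration; only the replacement rule comes from the recipe.

```python
# Standalone sketch of the recipe's print_version logic (no calibre needed).
def print_version(url):
    # Swap the RSS tracking fragment for the print-mode query string.
    return url.replace('.html#nv=rss', '.html?mode=print')


# Hypothetical feed link, for illustration only.
sample = 'http://www.boerse-online.de/aktie/beispiel.html#nv=rss'
print(print_version(sample))

# A link without the '#nv=rss' suffix is returned unchanged,
# so such articles would be fetched in their normal (non-print) layout.
plain = 'http://www.boerse-online.de/aktie/beispiel.html'
print(print_version(plain))
```

Because str.replace leaves non-matching input alone, the recipe quietly falls back to the regular page whenever the feed changes its link format.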
01-07-2013, 11:07 AM   #2
Divingduck
I made an update for this recipe because it wasn't working well.

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title                 = u'Börse-online'
    __author__            = 'schuster, Armin Geller'
    oldest_article        = 1
    max_articles_per_feed = 100
    no_stylesheets        = True
    use_embedded_content  = False
    language              = 'de'
    remove_javascript     = True
    encoding              = 'iso-8859-1'
    timefmt               = ' [%a, %d %b %Y]'
    
    
    cover_url = 'http://www.wirtschaftsmedien-shop.de/s/media/coverimages/7576_2013107.jpg'
    masthead_url = 'http://upload.wikimedia.org/wikipedia/de/5/56/B%C3%B6rse_Online_Logo.svg'

    remove_tags_after = [dict(name='div', attrs={'class':['artikelfuss', 'rahmen600']})]
    
    remove_tags = [
                    dict(name='div', attrs={'id':['breadcrumb', 'rightCol', 'clearall']}),
                    dict(name='div', attrs={'class':['footer', 'artikelfuss']}),
                  ]

    keep_only_tags    = [
                          dict(name='div', attrs={'id':['contentWrapper']})
                        ]

    feeds          = [(u'Börsennachrichten', u'http://www.boerse-online.de/rss/')]
    
    def print_version(self, url):
        return url.replace('.html#nv=rss', '.html?mode=print')
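
For reference, the timefmt value in this version is an ordinary strftime pattern. The sketch below shows what it produces for a sample date. It assumes the formatted date ends up appended to the recipe title; the title assembly here is only an illustration, not calibre's actual code.

```python
from datetime import date

# Same format string as the recipe's timefmt attribute.
timefmt = ' [%a, %d %b %Y]'

# Sample input only: the date of the post that introduced this version.
stamp = date(2013, 1, 7).strftime(timefmt)

# Assumption: the stamp is appended to the recipe title.
issue_title = 'Börse-online' + stamp
print(issue_title)
```

The %a and %b fields follow the active locale, so the day and month abbreviations may differ from system to system.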
Attached Files
File Type: zip Boerse-Online_AGe.zip (784 Bytes, 33 views)
11-29-2013, 09:56 AM   #3
Divingduck
Kovid,
the site changed its pages, so the recipe needed another update.

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title                     = u'Börse-online'
    __author__                = 'Armin Geller' #AGE upd 2013-11-29
    oldest_article            = 1
    max_articles_per_feed     = 100
    no_stylesheets            = True
    use_embedded_content      = False
    language                  = 'de'
    remove_javascript         = True
    remove_empty_feeds        = True
    ignore_duplicate_articles = {'title', 'url'}
    encoding                  = 'utf-8'
    timefmt                   = ' [%a, %d %b %Y]'
    
    cover_url = 'http://www.wirtschaftsmedien-shop.de/s/media/coverimages/7576_2013107.jpg'
    masthead_url = 'http://upload.wikimedia.org/wikipedia/de/5/56/B%C3%B6rse_Online_Logo.svg'

    feeds          = [(u'Börsennachrichten', u'http://www.boerse-online.de/rss'),
                       (u'Märkte', u'http://www.boerse-online.de/rss/maerkte'),
                       (u'Chartanalyse', u'http://www.boerse-online.de/rss/maerkte/chartanalyse'),
                       (u'Aktien', u'http://www.boerse-online.de/rss/aktie'),
                       (u'Aktien-Chartanalyse', u'http://www.boerse-online.de/rss/aktie/chartanalyse'),
                       (u'Zertifikate', u'http://www.boerse-online.de/rss/zertifikat')
                      ]
    
    def print_version(self, url):
        # Keep only the last path segment and append it to the print-view URL.
        article_id = url.rsplit('/', 1)[-1]
        return 'http://www.boerse-online.de/nachrichten/drucken/' + article_id
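
To see how this version of print_version maps a feed link to the print view, here is a small standalone sketch of the same logic. The sample article URLs are invented for illustration; only the 'nachrichten/drucken/' prefix and the rsplit step come from the recipe.

```python
PRINT_PREFIX = 'http://www.boerse-online.de/nachrichten/drucken/'


def print_version(url):
    # Take everything after the last '/' and append it to the print prefix.
    last_segment = url.rsplit('/', 1)[-1]
    return PRINT_PREFIX + last_segment


# Hypothetical article link, for illustration only.
sample = 'http://www.boerse-online.de/nachrichten/aktien/Beispiel-Artikel-1000001'
print(print_version(sample))

# Caveat: a link ending in '/' has an empty last segment,
# so the result would be the bare print prefix.
trailing = sample + '/'
print(print_version(trailing))
```

If the feed ever delivers links with a trailing slash, stripping it first with url.rstrip('/') would be one way to keep the mapping intact.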
Attached Files
File Type: zip Boerse-Online_AGeV2.zip (774 Bytes, 20 views)