Register Guidelines E-Books Search Today's Posts Mark Forums Read

Go Back   MobileRead Forums > E-Book Software > Calibre > Recipes

Notices

Reply
 
Thread Tools Search this Thread
Old 05-14-2011, 12:46 PM   #1
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
recipe for FAZ.net - german

Code:
import string, re
from calibre import strftime
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title          = u'Faz.net'
    __author__  = 'schuster'
    remove_tags = [dict(attrs={'class':['right', 'ArrowLinkRight', 'ModulVerlagsInfo', 'left', 'Head']}),
                dict(id=['BreadCrumbs', 'tstag', 'FazFooterPrint']),
                dict(name=['script', 'noscript', 'style'])]
    oldest_article = 2
    max_articles_per_feed = 100
    no_stylesheets         = True
    use_embedded_content   = False
    language               = 'de'
    remove_javascript      = True
    cover_url = 'http://www.faz.net/f30/Images/Logos/logo.gif'

    def print_version(self, url):
        return url.replace('.html', '~Afor~Eprint.html')



    feeds          = [(u'Politik', u'http://www.faz.net/s/RubA24ECD630CAE40E483841DB7D16F4211/Tpl~Epartner~SRss_.xml'), 
	      (u'Wirtschaft', u'http://www.faz.net/s/RubC9401175958F4DE28E143E68888825F6/Tpl~Epartner~SRss_.xml'), 
	      (u'Feuilleton', u'http://www.faz.net/s/RubCC21B04EE95145B3AC877C874FB1B611/Tpl~Epartner~SRss_.xml'),
	      (u'Sport', u'http://www.faz.net/s/Rub9F27A221597D4C39A82856B0FE79F051/Tpl~Epartner~SRss_.xml'), 
	      (u'Gesellschaft', u'http://www.faz.net/s/Rub02DBAA63F9EB43CEB421272A670A685C/Tpl~Epartner~SRss_.xml'),
	      (u'Finanzen', u'http://www.faz.net/s/Rub4B891837ECD14082816D9E088A2D7CB4/Tpl~Epartner~SRss_.xml'), 
	      (u'Wissen', u'http://www.faz.net/s/Rub7F4BEE0E0C39429A8565089709B70C44/Tpl~Epartner~SRss_.xml'),
	      (u'Reise', u'http://www.faz.net/s/RubE2FB5CA667054BDEA70FB3BC45F8D91C/Tpl~Epartner~SRss_.xml'), 
	      (u'Technik & Motor', u'http://www.faz.net/s/Rub01E4D53776494844A85FDF23F5707AD8/Tpl~Epartner~SRss_.xml'),
	      (u'Beruf & Chance', u'http://www.faz.net/s/RubB1E10A8367E8446897468EDAA6EA0504/Tpl~Epartner~SRss_.xml'), 
	      (u'Kunstmarkt', u'http://www.faz.net/s/RubBC09F7BF72A2405A96718ECBFB68FBFE/Tpl~Epartner~SRss_.xml'),
	      (u'Immobilien ', u'http://www.faz.net/s/RubFED172A9E10F46B3A5F01B02098C0C8D/Tpl~Epartner~SRss_.xml'), 
	      (u'Rhein-Main Zeitung', u'http://www.faz.net/s/RubABE881A6669742C2A5EBCB5D50D7EBEE/Tpl~Epartner~SRss_.xml'),
	      (u'Atomdebatte ', u'http://www.faz.net/s/Rub469C43057F8C437CACC2DE9ED41B7950/Tpl~Epartner~SRss_.xml')

]
this is another faz.net recipe because the original one is not detailed enough (for me).
this one is in detail (categorys) built like the print version of the newspaper

Last edited by schuster; 05-14-2011 at 03:38 PM. Reason: Declaration
schuster is offline   Reply With Quote
Old 05-15-2011, 10:31 AM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
There is already a recipe for faz.net is yours different?
kovidgoyal is offline   Reply With Quote
Advert
Old 05-15-2011, 11:26 AM   #3
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
hi kovid,

yes it is different, becaus it has the full range of articels that can be received from faz.net and the categorization from the real-print-version.

greetings
schuster is offline   Reply With Quote
Old 05-16-2011, 06:54 AM   #4
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
for better explanation here some screenshots

first is the site-view

Click image for larger version

Name:	screen1.jpg
Views:	347
Size:	126.2 KB
ID:	71429


and the articel overview

Click image for larger version

Name:	screen2.jpg
Views:	376
Size:	208.6 KB
ID:	71432
schuster is offline   Reply With Quote
Old 05-16-2011, 10:59 AM   #5
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No worries, I already updated it, I just forgot to post here.
kovidgoyal is offline   Reply With Quote
Advert
Old 05-26-2011, 01:25 PM   #6
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
*Update*

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe
class AdvancedUserRecipe1303841067(BasicNewsRecipe):

    title          = u'Faz.net'
    __author__  = 'schuster'
    oldest_article = 1
    description           = 'Frankfurter Allgemeine Zeitung'
    max_articles_per_feed = 100
    no_stylesheets         = True
    use_embedded_content   = False
    language               = 'de'
    remove_javascript      = True
    cover_url = 'http://www.faz.net/f30/Images/Logos/logo.gif'

    remove_tags = [dict(attrs={'class':['LinkBoxModulSmall', 'ModulLesermeinungenFooter', 'ModulArtikelServices', 'SocialMediaUnten', 'ArrowLinkRight', 'ModulVerlagsInfo', 'AdData', 'FazFooter', 'Date']}),
                dict(id=['FAZNavHeader', 'FAZNavMain', 'RightColumn', 'FazFooter', 'BreadCrumbs', 'FAZNavSubMain', 'FAZImgEvent']),
                dict(name=['jksrdt'])]


    feeds          = [(u'Politik', u'http://www.faz.net/s/RubA24ECD630CAE40E483841DB7D16F4211/Tpl~Epartner~SRss_.xml'),
          (u'Wirtschaft', u'http://www.faz.net/s/RubC9401175958F4DE28E143E68888825F6/Tpl~Epartner~SRss_.xml'),
          (u'Feuilleton', u'http://www.faz.net/s/RubCC21B04EE95145B3AC877C874FB1B611/Tpl~Epartner~SRss_.xml'),
          (u'Sport', u'http://www.faz.net/s/Rub9F27A221597D4C39A82856B0FE79F051/Tpl~Epartner~SRss_.xml'),
          (u'Gesellschaft', u'http://www.faz.net/s/Rub02DBAA63F9EB43CEB421272A670A685C/Tpl~Epartner~SRss_.xml'),
          (u'Finanzen', u'http://www.faz.net/s/Rub4B891837ECD14082816D9E088A2D7CB4/Tpl~Epartner~SRss_.xml'),
          (u'Wissen', u'http://www.faz.net/s/Rub7F4BEE0E0C39429A8565089709B70C44/Tpl~Epartner~SRss_.xml'),
          (u'Reise', u'http://www.faz.net/s/RubE2FB5CA667054BDEA70FB3BC45F8D91C/Tpl~Epartner~SRss_.xml'),
          (u'Technik & Motor', u'http://www.faz.net/s/Rub01E4D53776494844A85FDF23F5707AD8/Tpl~Epartner~SRss_.xml'),
          (u'Beruf & Chance', u'http://www.faz.net/s/RubB1E10A8367E8446897468EDAA6EA0504/Tpl~Epartner~SRss_.xml'),
          (u'Kunstmarkt', u'http://www.faz.net/s/RubBC09F7BF72A2405A96718ECBFB68FBFE/Tpl~Epartner~SRss_.xml'),
          (u'Immobilien ', u'http://www.faz.net/s/RubFED172A9E10F46B3A5F01B02098C0C8D/Tpl~Epartner~SRss_.xml'),
          (u'Rhein-Main Zeitung', u'http://www.faz.net/s/RubABE881A6669742C2A5EBCB5D50D7EBEE/Tpl~Epartner~SRss_.xml'),
          (u'Atomdebatte ', u'http://www.faz.net/s/Rub469C43057F8C437CACC2DE9ED41B7950/Tpl~Epartner~SRss_.xml')
    ]
schuster is offline   Reply With Quote
Old 05-26-2011, 01:46 PM   #7
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.kovidgoyal ought to be getting tired of karma fortunes by now.
 
kovidgoyal's Avatar
 
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
updated
kovidgoyal is offline   Reply With Quote
Old 05-27-2011, 05:57 AM   #8
juco
Enthusiast
juco is on a distinguished road
 
juco's Avatar
 
Posts: 28
Karma: 50
Join Date: Oct 2003
Location: Bavaria/Germany
Device: Palm m105, Kindle KB
Until the day before yesterday, I used the "old" receipe that came with calibre, which I had enhanced by adding the other category feeds (in pretty much the same way user schuster did in his script):

Code:
__license__   = 'GPL v3'
__copyright__ = '2008-2011, Kovid Goyal <kovid at kovidgoyal.net>, Darko Miletic <darko at gmail.com>, Ralph Stenzel <ralph at klein-aber-fein.de>'
'''
Profile to download FAZ.NET
'''

from calibre.web.feeds.news import BasicNewsRecipe

class FazNet(BasicNewsRecipe):
    title                 = 'FAZ.NET'
    __author__            = 'Kovid Goyal, Darko Miletic, Ralph Stenzel'
    description           = 'Frankfurter Allgemeine Zeitung'
    publisher             = 'Frankfurter Allgemeine Zeitung GmbH'
    category              = 'news, politics, Germany'
    use_embedded_content  = False
    language = 'de'

    max_articles_per_feed = 30
    no_stylesheets        = True
    encoding              = 'utf-8'
    remove_javascript     = True

    html2lrf_options = [
                          '--comment', description
                        , '--category', category
                        , '--publisher', publisher
                        ]

    html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'

    keep_only_tags = [dict(name='div', attrs={'class':'Article'})]

    remove_tags = [
                     dict(name=['object','link','embed','base'])
                    ,dict(name='div', attrs={'class':['LinkBoxModulSmall','ModulVerlagsInfo']})
                  ]

    feeds = [
              ('FAZ.NET Aktuell', 'http://www.faz.net/s/RubF3CE08B362D244869BE7984590CB6AC1/Tpl~Epartner~SRss_.xml'),
              ('Politik', 'http://www.faz.net/s/RubA24ECD630CAE40E483841DB7D16F4211/Tpl~Epartner~SRss_.xml'),
              ('Wirtschaft', 'http://www.faz.net/s/RubC9401175958F4DE28E143E68888825F6/Tpl~Epartner~SRss_.xml'),
              ('Feuilleton', 'http://www.faz.net/s/RubCC21B04EE95145B3AC877C874FB1B611/Tpl~Epartner~SRss_.xml'),
              ('Sport', 'http://www.faz.net/s/Rub9F27A221597D4C39A82856B0FE79F051/Tpl~Epartner~SRss_.xml'),
              ('Gesellschaft', 'http://www.faz.net/s/Rub02DBAA63F9EB43CEB421272A670A685C/Tpl~Epartner~SRss_.xml'),
              ('Finanzen', 'http://www.faz.net/s/Rub4B891837ECD14082816D9E088A2D7CB4/Tpl~Epartner~SRss_.xml'),
              ('Wissen', 'http://www.faz.net/s/Rub7F4BEE0E0C39429A8565089709B70C44/Tpl~Epartner~SRss_.xml'),
              ('Reise', 'http://www.faz.net/s/RubE2FB5CA667054BDEA70FB3BC45F8D91C/Tpl~Epartner~SRss_.xml'),
              ('Technik & Motor', 'http://www.faz.net/s/Rub01E4D53776494844A85FDF23F5707AD8/Tpl~Epartner~SRss_.xml'),
              ('Beruf & Chance', 'http://www.faz.net/s/RubB1E10A8367E8446897468EDAA6EA0504/Tpl~Epartner~SRss_.xml')
            ]

    def print_version(self, url):
        article, sep, rest = url.partition('?')
        return article.replace('.html', '~Afor~Eprint.html')

    def preprocess_html(self, soup):
        mtag = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
        soup.head.insert(0,mtag)
        del soup.body['onload']
        for item in soup.findAll(style=True):
            del item['style']
        return soup
This script was working very well and delivered the articles in an outstanding layout to my Kindle 3. However, the programmers of faz.net seem to have changed something recently, so the receipe is not working anymore since last Wednesday. The URLs of the feeds are still valid (and the same as before), but I'm not competent to find and fix the problem involved myself.

The new receipe from user schuster *does* deliver content, however it does not provide the same "polished" and well-looking results (cropping, formatting etc.) which the previous script did. Perhaps someone here is able to fix the receipe cited here in my comment so that it may be put in service again? Any help would be greatly appreciated!

Thanks in advance,
Ralph

Last edited by juco; 05-27-2011 at 08:18 AM.
juco is offline   Reply With Quote
Old 05-27-2011, 10:47 AM   #9
schuster
Zealot
schuster doesn't litterschuster doesn't litter
 
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
hi juco,
proofed on kindle3 and looks like the previous one.
could you pn me what the different in detail.

greetings
schuster is offline   Reply With Quote
Old 05-27-2011, 02:25 PM   #10
juco
Enthusiast
juco is on a distinguished road
 
juco's Avatar
 
Posts: 28
Karma: 50
Join Date: Oct 2003
Location: Bavaria/Germany
Device: Palm m105, Kindle KB
Hi schuster,

thanks for getting in touch with me! Will send you a PN soon in our mother's tongue... ;-)

Yours,
Ralph
juco is offline   Reply With Quote
Old 05-28-2011, 12:13 AM   #11
juco
Enthusiast
juco is on a distinguished road
 
juco's Avatar
 
Posts: 28
Karma: 50
Join Date: Oct 2003
Location: Bavaria/Germany
Device: Palm m105, Kindle KB
The updated FAZ.NET receipe that comes bundled with calibre v0.8.3 works perfectly again. Thank you very much indeeed, Kovid! :-)

Yours,
Ralph
juco is offline   Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search

Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
LWN.net Weekly News recipe davide125 Recipes 22 11-12-2014 09:44 PM
Request: Inquirer.net Recipe update zoilom Recipes 0 12-21-2010 01:06 AM
A recipe for "Siol.net" BlonG Recipes 1 11-08-2010 11:15 AM
FAZ: Deutschland vs. Google KernelPanic Lounge 3 12-04-2009 06:21 AM


All times are GMT -4. The time now is 02:18 PM.


MobileRead.com is a privately owned, operated and funded community.