03-10-2011, 12:03 PM   #1
derdon
Junior Member
Posts: 4
Karma: 10
Join Date: Mar 2011
Device: Kindle 3
Recipe for gulli.de and Golem.de

Hello forum members,

After reading around in this forum a little, I found this:

"If the recipe is broken, post a polite notice here and the recipe author may respond"

So that is what I'll try:

I'm sorry to report that the Calibre built-in recipes for gulli.de and golem.de are not working (anymore). The one for gulli.de has never worked during the last few weeks (that is how long I have been using Calibre); the one for golem.de stopped working recently, about 4 days ago.
Both recipes download something, but the resulting e-book does not contain the articles from the sites. I am using version 0.7.48 of Calibre under Ubuntu 10.10.

It would be great if we could find a solution for those problems!

Thank you for your time,


regards

Don
03-10-2011, 05:35 PM   #2
marvin_2
Enthusiast
Posts: 25
Karma: 4472
Join Date: Jan 2011
Device: Kindle
I'm not the author and am fairly new to recipes, but I had started looking into adding more feeds to the Golem recipe. The current version seems to work fine, except for the first table: the small image and caption on the right come out badly on the Kindle, so I took them out. That is a bit of a pity, since I had earlier switched the recipe from the print version to the full article page to preserve the images ...

Code:
#!/usr/bin/env  python

from calibre.web.feeds.news import BasicNewsRecipe
class golem_ger(BasicNewsRecipe):
    title                 = u'Golem.de'
    __author__            = 'Kovid Goyal'
    language              = 'de'
    lang                  = 'de-DE'
    oldest_article        = 7
    max_articles_per_feed = 100
    no_stylesheets        = True
    encoding              = 'iso-8859-1'
    # Follow links one level deep, restricted to golem.de .html pages.
    # The dots are escaped so that they match literally.
    recursions            = 1
    match_regexps         = [r'http://www\.golem\.de/.*\.html']
    
    keep_only_tags     = [
                               dict(name='h1', attrs={'class':'artikelhead'}),
                               dict(name='p', attrs={'class':'teaser'}),
                               dict(name='div', attrs={'class':'artikeltext'}),
                               dict(name='h2', attrs={'id':'artikelhead'}),
                            ]
 

                    
    remove_tags = [
                    dict(name='div', attrs={'id':['similarContent','topContentWrapper','storycarousel','aboveFootPromo','comments','toolbar','breadcrumbs','commentlink','sidebar','rightColumn']}),
                    dict(name='div', attrs={'class':['gg_embeddedSubText','gg_embeddedIndex gg_solid','gg_toOldGallery','golemGallery']}),
                    dict(name='img', attrs={'class':['gg_embedded','gg_embeddedIconRight gg_embeddedIconFS gg_cursorpointer']}),
                    dict(name='td', attrs={'class':['xsmall']}),
                    ]


    # remove_tags_after  = [
      #                      dict(name='div', attrs={'id':['contentad2']})
       #                 ]


    feeds          = [
                      (u'Golem.de', u'http://rss.golem.de/rss.php?feed=ATOM1.0'),
                      (u'Audio/Video', u'http://rss.golem.de/rss.php?tp=av&feed=RSS2.0'),
                      (u'Foto', u'http://rss.golem.de/rss.php?tp=foto&feed=RSS2.0'),
                      (u'Games', u'http://rss.golem.de/rss.php?tp=games&feed=RSS2.0'),
                      (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS1.0'),
                      (u'Mobil', u'http://rss.golem.de/rss.php?tp=mc&feed=ATOM1.0'),
                      (u'Politik/Recht', u'http://rss.golem.de/rss.php?tp=pol&feed=ATOM1.0'),
                      (u'Desktop-Applikationen', u'http://rss.golem.de/rss.php?tp=apps&feed=RSS2.0'),
                      (u'Software-Entwicklung', u'http://rss.golem.de/rss.php?tp=dev&feed=RSS2.0'),
                      (u'Wirtschaft', u'http://rss.golem.de/rss.php?tp=wirtschaft&feed=RSS2.0'),
                      (u'Hardware', u'http://rss.golem.de/rss.php?r=hw&feed=RSS2.0'),
                      (u'Software', u'http://rss.golem.de/rss.php?r=sw&feed=RSS2.0'),
                      (u'Networld', u'http://rss.golem.de/rss.php?r=nw&feed=RSS2.0'),
                      (u'Entertainment', u'http://rss.golem.de/rss.php?r=et&feed=RSS2.0'),
                      (u'TK', u'http://rss.golem.de/rss.php?r=tk&feed=RSS2.0'),
                      (u'E-Commerce', u'http://rss.golem.de/rss.php?r=ec&feed=RSS2.0'),
                      (u'Unternehmen/Maerkte', u'http://rss.golem.de/rss.php?r=wi&feed=RSS2.0')
                      ]
                      
                      
                      
                      
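    # NOTE: this second assignment replaces the longer list above, so only
    # these feeds are downloaded. Delete it to fetch all sections instead.
    feeds          = [
                      (u'Golem.de', u'http://rss.golem.de/rss.php?feed=ATOM1.0'),
                      (u'Mobil', u'http://rss.golem.de/rss.php?tp=mc&feed=RSS2.0'),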
                      (u'OSS', u'http://rss.golem.de/rss.php?tp=oss&feed=RSS2.0'),
                      (u'Politik/Recht', u'http://rss.golem.de/rss.php?tp=pol&feed=RSS2.0'),
                      (u'Desktop-Applikationen', u'http://rss.golem.de/rss.php?tp=apps&feed=RSS2.0'),
                      (u'Software-Entwicklung', u'http://rss.golem.de/rss.php?tp=dev&feed=RSS2.0'),
                      ]


    extra_css = '''
                h1 {color:#0066CC;font-family:Arial,Helvetica,sans-serif; font-size:30px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:20px;margin-bottom:2em;}
                h2 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:22px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:16px; }
                h3 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:x-small; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:normal; line-height:5px;}
                h4 {color:#333333; font-family:Arial,Helvetica,sans-serif;font-size:13px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:13px; }
                h5 {color:#333333; font-family:Arial,Helvetica,sans-serif; font-size:11px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:11px; text-transform:uppercase;}
                .teaser {font-style:italic;font-size:12pt;margin-bottom:15pt;}
                .xsmall{font-style:italic;font-size:x-small;}
                .td{font-style:italic;font-size:x-small;}
                img {align:left;}
                '''
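
A side note on the recipe above: it assigns `feeds` twice in the class body, and in Python the later assignment simply replaces the earlier one, so only the short list is actually fetched. Here is a minimal sketch in plain Python (no calibre needed; `DemoRecipe` and `dedupe_feeds` are made-up names for illustration, and the lists are shortened) that shows the effect, plus one possible way to drop repeated feed URLs, which are easy to introduce when a list grows:

```python
class DemoRecipe:
    # First assignment: stands in for the long list (shortened here).
    feeds = [
        (u'Golem.de', u'http://rss.golem.de/rss.php?feed=ATOM1.0'),
        (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS1.0'),
        (u'Mobil', u'http://rss.golem.de/rss.php?tp=mc&feed=ATOM1.0'),
        (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS1.0'),
    ]
    # Second assignment in the same class body: this one wins.
    feeds = [
        (u'Golem.de', u'http://rss.golem.de/rss.php?feed=ATOM1.0'),
        (u'OSS', u'http://rss.golem.de/rss.php?tp=oss&feed=RSS2.0'),
    ]


def dedupe_feeds(feeds):
    """Return the feeds with repeated URLs removed, keeping the first of each."""
    seen = set()
    unique = []
    for title, url in feeds:
        if url not in seen:
            seen.add(url)
            unique.append((title, url))
    return unique


# Only the second list survives on the class.
print([title for title, _ in DemoRecipe.feeds])

# A list with a repeated URL, cleaned up.
long_list = [
    (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS1.0'),
    (u'Mobil', u'http://rss.golem.de/rss.php?tp=mc&feed=ATOM1.0'),
    (u'Internet', u'http://rss.golem.de/rss.php?tp=inet&feed=RSS1.0'),
]
print(dedupe_feeds(long_list))
```

If the goal is to fetch every section, keeping a single `feeds` list in the recipe is probably the simplest fix.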


Gulli has apparently changed its page markup - the attached recipe, with updated keep/remove tags, seems to work.

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1259599587(BasicNewsRecipe):
    title          = u'Gulli'
    description = 'News from Germany'
    language = 'de'
    __author__ = 'posativ'
    oldest_article = 7
    max_articles_per_feed = 100
    no_stylesheets = True

    feeds          = [(u'gulli:news', u'http://ticker.gulli.com/rss/')]

    remove_tags = [dict(name='div', attrs={'class':['FloatL','_forumBox']})]

    keep_only_tags = [dict(name='div', attrs={'id':['_contentLeft']})]
    
    remove_tags_after  = [dict(name='div', attrs={'class':['_bookmark']})]
    
    
    
    
    
    extra_css = '''
                h1 {color:#008852;font-family:Arial,Helvetica,sans-serif; font-size:25px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:22px; }
                h2 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:18px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:16px; }
                h3 {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif; font-size:15px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px;}
                h4 {color:#333333; font-family:Arial,Helvetica,sans-serif;font-size:12px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px; }
                h5 {color:#333333; font-family:Arial,Helvetica,sans-serif; font-size:11px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px; text-transform:uppercase;}
                .newsdate {color:#333333;font-family:Arial,Helvetica,sans-serif;font-size:10px; font-size-adjust:none; font-stretch:normal; font-style:italic; font-variant:normal; font-weight:bold; line-height:10px; text-decoration:none;}
                .articleInfo {color:#4D4D4D;font-family:Arial,Helvetica,sans-serif;font-size:10px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:10px; text-decoration:none;}
                .byline {color:#666;margin-bottom:0;font-size:12px}
                .blockquote {color:#030303;font-style:italic;padding-left:15px;}
                img {align:center;}
                .li {list-style-type: none}
                '''
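
When a site changes its markup, the usual symptom is that the `keep_only_tags` entries no longer match anything, which leaves the e-book empty. The sketch below is one rough way to check that with the Python standard library alone. It is a simplified model of the tag specs (tag name equal, attribute value equal to one of the listed values), not calibre's own matching code. The helper names and both sample pages are invented for illustration, and the id `content` in the second page is hypothetical.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect (name, attrs) for every start tag in a page."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def spec_matches(spec, tag, attrs):
    """Simplified model: same tag name, each attribute equal to a listed value."""
    if spec.get('name') != tag:
        return False
    for key, wanted in spec.get('attrs', {}).items():
        if isinstance(wanted, str):
            wanted = [wanted]
        if attrs.get(key) not in wanted:
            return False
    return True


def unmatched_specs(specs, html):
    """Return the specs that match no tag at all in the given HTML."""
    collector = TagCollector()
    collector.feed(html)
    return [s for s in specs
            if not any(spec_matches(s, t, a) for t, a in collector.tags)]


keep_only_tags = [dict(name='div', attrs={'id': ['_contentLeft']})]

# Made-up pages: one with the expected container, one after a redesign.
old_page = '<html><body><div id="_contentLeft"><p>Text</p></div></body></html>'
new_page = '<html><body><div id="content"><p>Text</p></div></body></html>'

print(unmatched_specs(keep_only_tags, old_page))  # nothing stale
print(unmatched_specs(keep_only_tags, new_page))  # the spec no longer matches
```

To use it on a real page, save an article's HTML from the browser and pass its contents in place of the sample strings; any spec that comes back is a candidate for updating.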


golem_de.zip

gulli.zip
03-11-2011, 01:32 PM   #3
derdon
You made my day!
Thank you very much for those two recipes, they work perfectly well!

Best regards,

Don
05-12-2011, 12:24 PM   #4
derdon
Golem changed something

Hello again,

golem.de proudly presented its new design... nice, but none of our recipes works any more: neither the built-in one in Calibre nor the one so kindly posted by marvin_2 some weeks ago.

I would be glad if someone had a new one for golem.de.

Best wishes,

derdon
