02-12-2012, 06:12 AM   #1
clanger9
Member
Posts: 11
Karma: 138
Join Date: Nov 2010
Device: Kindle 3
Kurier recipe update

The Kurier website has been revamped, so the calibre recipe needs updating.

Suggested patch below; it fixes the character encoding issues and accounts for the site's structural changes:

Code:
*** kurier.recipe	Mon Feb  6 07:47:02 2012
--- kurier.recipe.orig	Sat Feb  4 17:43:04 2012
***************
*** 13,22 ****
      publisher             = 'KURIER'
      category              = 'news, politics, Austria'
      oldest_article        = 2
!     max_articles_per_feed = 100
!     timeout               = 30
!     encoding              = None
      no_stylesheets        = True
      use_embedded_content  = False
      language              = 'de_AT'
      remove_empty_feeds    = True
--- 13,21 ----
      publisher             = 'KURIER'
      category              = 'news, politics, Austria'
      oldest_article        = 2
!     max_articles_per_feed = 200
      no_stylesheets        = True
+     encoding              = 'cp1252'
      use_embedded_content  = False
      language              = 'de_AT'
      remove_empty_feeds    = True
***************
*** 30,40 ****
                          , 'language'  : language
                          }
  
!     remove_tags = [ dict(attrs={'id':['artikel_expand_symbol2','imgzoom_close2']}), 
!                     dict(attrs={'class':['linkextern','functionsleiste','functions','social_positionierung','contenttabs','drucken','versenden','leserbrief','kommentieren','addthis_button']})
!                    ]
      keep_only_tags    = [dict(attrs={'id':'content'})]
!     remove_tags_after = [dict(attrs={'id':'author'})]
      remove_attributes = ['width','height']
  
      feeds = [
--- 29,37 ----
                          , 'language'  : language
                          }
  
!     remove_tags = [dict(attrs={'class':['functionsleiste','functions','social_positionierung','contenttabs','drucken','versenden','leserbrief','kommentieren','addthis_button']})]
      keep_only_tags    = [dict(attrs={'id':'content'})]
!     remove_tags_after = dict(attrs={'id':'author'})
      remove_attributes = ['width','height']
  
      feeds = [
***************
*** 44,50 ****
                ,(u'Kultur'     , u'http://kurier.at/rss/kultur_kultur_rss.xml'   )
                ,(u'Freizeit'   , u'http://kurier.at/rss/freizeit_freizeit_rss.xml'   )
                ,(u'Wetter'     , u'http://kurier.at/rss/oewetter_rss.xml'   )
!               ,(u'Sport'      , u'http://kurier.at/newsfeed/detail/sport_rss.xml'   )
              ]
  
      def preprocess_html(self, soup):
--- 41,47 ----
                ,(u'Kultur'     , u'http://kurier.at/rss/kultur_kultur_rss.xml'   )
                ,(u'Freizeit'   , u'http://kurier.at/rss/freizeit_freizeit_rss.xml'   )
                ,(u'Wetter'     , u'http://kurier.at/rss/oewetter_rss.xml'   )
!               ,(u'Verkehr'    , u'http://kurier.at/rss/verkehr_rss.xml'   )
              ]
  
      def preprocess_html(self, soup):
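
On the encoding change: in a calibre recipe, `encoding` overrides the page's declared charset, and `None` tells calibre to try to detect it. The sketch below is only an illustration of why a hard-coded `'cp1252'` can garble German text; it assumes (the thread does not say so) that the revamped site now serves UTF-8. `SAMPLE` and `decode_with` are hypothetical names made up for this example and are not part of calibre.

```python
# Illustration only: how a hard-coded codec can garble German text.
# Assumption (not stated in the thread): the revamped site serves UTF-8.

SAMPLE = "Österreich: Grüße aus Wien"


def decode_with(raw: bytes, codec: str) -> str:
    """Decode raw page bytes with a fixed codec, replacing undecodable bytes."""
    return raw.decode(codec, errors="replace")


# Bytes as a UTF-8 server would send them.
raw_page = SAMPLE.encode("utf-8")

# Decoding with the old hard-coded codec turns each umlaut into two characters.
garbled = decode_with(raw_page, "cp1252")

# Decoding with the codec the bytes were actually written in restores the text.
correct = decode_with(raw_page, "utf-8")

print(garbled)
print(correct)
```

Leaving `encoding = None` in the recipe avoids pinning a codec that may be wrong after a site redesign.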
02-12-2012, 08:49 AM   #2
kovidgoyal
creator of calibre
Posts: 45,595
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Your patch doesn't apply against the current recipe version: http://bazaar.launchpad.net/~kovid/c.../kurier.recipe
02-12-2012, 05:36 PM   #3
clanger9
Sorry, my fault: I made my changes some time ago against an earlier version of the recipe.
I will post a revised patch...
02-12-2012, 05:57 PM   #4
clanger9
This patch was created with `diff -uNr kurier.recipe.old kurier.recipe.new` against the kurier.recipe file in trunk.

Hope it's OK now.

Code:
--- kurier.recipe.old	2012-02-12 22:53:26.000000000 +0000
+++ kurier.recipe.new	2012-02-12 22:54:21.000000000 +0000
@@ -13,9 +13,10 @@
     publisher             = 'KURIER'
     category              = 'news, politics, Austria'
     oldest_article        = 2
-    max_articles_per_feed = 200
+    max_articles_per_feed = 100
+    timeout               = 30
+    encoding              = None
     no_stylesheets        = True
-    encoding              = 'cp1252'
     use_embedded_content  = False
     language              = 'de_AT'
     remove_empty_feeds    = True
@@ -29,9 +30,11 @@
                         , 'language'  : language
                         }
 
-    remove_tags = [dict(attrs={'class':['functionsleiste','functions','social_positionierung','contenttabs','drucken','versenden','leserbrief','kommentieren','addthis_button']})]
+    remove_tags = [ dict(attrs={'id':['artikel_expand_symbol2','imgzoom_close2']}), 
+                    dict(attrs={'class':['linkextern','functionsleiste','functions','social_positionierung','contenttabs','drucken','versenden','leserbrief','kommentieren','addthis_button']})
+                   ]
     keep_only_tags    = [dict(attrs={'id':'content'})]
-    remove_tags_after = dict(attrs={'id':'author'})
+    remove_tags_after = [dict(attrs={'id':'author'})]
     remove_attributes = ['width','height']
 
     feeds = [
@@ -41,7 +44,7 @@
               ,(u'Kultur'     , u'http://kurier.at/rss/kultur_kultur_rss.xml'   )
               ,(u'Freizeit'   , u'http://kurier.at/rss/freizeit_freizeit_rss.xml'   )
               ,(u'Wetter'     , u'http://kurier.at/rss/oewetter_rss.xml'   )
-              ,(u'Verkehr'    , u'http://kurier.at/rss/verkehr_rss.xml'   )
+              ,(u'Sport'      , u'http://kurier.at/newsfeed/detail/sport_rss.xml'   )
             ]
 
     def preprocess_html(self, soup):
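
For anyone without `diff` to hand, a unified diff in the same shape can be produced with Python's standard `difflib` module. This is a minimal sketch, not the command used above: the two line lists are short excerpts copied from the first hunk of the patch rather than the full recipe files, and the variable names are made up for the example.

```python
import difflib

# Excerpt of the recipe as it stands in trunk (the "old" side).
old_lines = [
    "    oldest_article        = 2\n",
    "    max_articles_per_feed = 200\n",
    "    no_stylesheets        = True\n",
    "    encoding              = 'cp1252'\n",
]

# The same excerpt with the proposed changes applied (the "new" side).
new_lines = [
    "    oldest_article        = 2\n",
    "    max_articles_per_feed = 100\n",
    "    timeout               = 30\n",
    "    encoding              = None\n",
    "    no_stylesheets        = True\n",
]

# Old file first, new file second; swapping them would yield a reversed patch.
patch = list(
    difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile="kurier.recipe.old",
        tofile="kurier.recipe.new",
    )
)

print("".join(patch), end="")
```

The order of the two inputs matters: lines prefixed with `-` come from the first file and lines prefixed with `+` from the second.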
02-13-2012, 12:52 AM   #5
kovidgoyal
Yeah, that applies, thanks.