#1
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Device: Kindle Paperwhite (2014)
Updated Telepolis (News+Artikel) Recipe
Hi There,
I've updated the Telepolis recipe. Changes:
* Page breaks are now correct on Kindle / Mobi format
* Fetches both articles and news
* Comments below articles are no longer shown
Code:
# -*- coding: utf-8 -*-

__license__ = 'GPL v3'
__copyright__ = '2009, Gerhard Aigner <gerhard.aigner at gmail.com>'

import re
from calibre.web.feeds.news import BasicNewsRecipe


class TelepolisNews(BasicNewsRecipe):
    title = u'Telepolis (News+Artikel)'
    __author__ = 'Gerhard Aigner'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co KG'
    description = 'News from telepolis'
    category = 'news'
    oldest_article = 7
    max_articles_per_feed = 100
    recursion = 0
    no_stylesheets = True
    encoding = "utf-8"
    language = 'de_AT'
    use_embedded_content = False
    remove_empty_feeds = True

    # Strip opening and closing anchor tags, keeping the link text
    preprocess_regexps = [
        (re.compile(r'<a[^>]*>', re.DOTALL|re.IGNORECASE), lambda match: ''),
        (re.compile(r'</a>', re.DOTALL|re.IGNORECASE), lambda match: ''),
    ]

    keep_only_tags = [
        dict(name='td', attrs={'class': 'bloghead'}),
        dict(name='td', attrs={'class': 'blogfliess'}),
    ]
    remove_tags = [
        dict(name='img'),
        dict(name='td', attrs={'class': 'blogbottom'}),
        dict(name='td', attrs={'class': 'forum'}),
    ]

    feeds = [(u'News', u'http://www.heise.de/tp/news-atom.xml')]

    html2lrf_options = [
        '--comment', description,
        '--category', category,
        '--publisher', publisher,
    ]
    html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'

    def get_article_url(self, article):
        '''if the linked article is of kind artikel don't take it'''
        if (article.link.count('artikel') > 1):
            return None
        return article.link

    def preprocess_html(self, soup):
        mtag = '<meta http-equiv="Content-Type" content="text/html; charset=' + self.encoding + '">'
        soup.head.insert(0, mtag)
        return soup
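To see what the two `preprocess_regexps` entries in the recipe above do, here is a minimal standalone sketch that applies the same pattern/function pairs with plain `re` calls. The helper name `apply_regexps` and the sample HTML are made up for illustration; this is not calibre's own code path, only an approximation of its effect.

```python
import re

# Same (pattern, function) pairs as in the recipe's preprocess_regexps.
PREPROCESS_REGEXPS = [
    (re.compile(r'<a[^>]*>', re.DOTALL | re.IGNORECASE), lambda match: ''),
    (re.compile(r'</a>', re.DOTALL | re.IGNORECASE), lambda match: ''),
]


def apply_regexps(html, rules=PREPROCESS_REGEXPS):
    """Hypothetical helper: run each pattern over the HTML in order."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


# Made-up sample markup with one lower-case and one upper-case link.
sample = ('<p>See <a href="http://example.com/x">this article</a> '
          'and <A HREF="#c">comments</A>.</p>')
print(apply_regexps(sample))
```

The link text survives and only the anchor tags are dropped. Note that `<a[^>]*>` also matches other opening tags whose name starts with "a" (for example `<abbr>`), so a stricter pattern such as `<a(\s[^>]*)?>` may be worth considering.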
#2
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2011
Device: SONY TOUCH 650
Hi, I've got a Sony 650 Touch Reader. Unfortunately, the device reboots when I open my Telepolis EPUB download. Do you have any idea why? Thanks, Pat
#3
US Navy, Retired
Posts: 9,889
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Converting the EPUB to Mobi and then back to EPUB might make the book viewable. Alternatively, opening the EPUB in Sigil and saving it again as an EPUB from within Sigil might remove incompatible code.
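The round trip suggested here can also be scripted with calibre's `ebook-convert` command-line tool, which takes an input file and an output file. The following is a rough sketch only: the function names and the `-roundtrip` output file name are invented for this example, and whether the result actually opens on the Sony reader is not guaranteed.

```python
import subprocess
from pathlib import Path


def round_trip_commands(epub_path):
    """Build the two ebook-convert calls: EPUB -> MOBI, then MOBI -> EPUB."""
    src = Path(epub_path)
    mobi = src.with_suffix('.mobi')
    # Hypothetical output name, chosen so the original download is not overwritten.
    fixed = src.with_name(src.stem + '-roundtrip.epub')
    return [
        ['ebook-convert', str(src), str(mobi)],
        ['ebook-convert', str(mobi), str(fixed)],
    ]


def round_trip(epub_path):
    """Run both conversions; raises CalledProcessError if one of them fails."""
    for cmd in round_trip_commands(epub_path):
        subprocess.run(cmd, check=True)


# Only print the commands here; call round_trip() to actually run them.
for cmd in round_trip_commands('telepolis.epub'):
    print(' '.join(cmd))
```

Calling `round_trip('telepolis.epub')` requires calibre to be installed and `ebook-convert` to be on the PATH.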
#4
creator of calibre
Posts: 45,160
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No point opening a bug report as I do not provide support for recipes that I haven't written.
#5
aka zonebattler
Posts: 30
Karma: 50
Join Date: Oct 2003
Location: Fürth, Germany
Device: Kindle KB, Kindle PW Signature Edition (11. Gen)
Hi, the Telepolis website was relaunched recently (with major layout changes), causing the current recipe to fail. Is anybody willing, able and determined to update the recipe? BTW, Telepolis is a German and not an Austrian publication: It should be listed under »Deutsch« and not under »German (AT)« since it might be overlooked there.
Thanks, Ralph
Last edited by juco; 05-04-2011 at 11:49 AM.
#6
Well trained by Cats
Posts: 30,887
Karma: 59840450
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Moderator Notice
Moved to Recipes
#7
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Device: Kindle Paperwhite (2014)
Hi,
I've rewritten the recipe (pictures are now included as well). Code:
# -*- coding: utf-8 -*-

import re
from calibre.web.feeds.news import BasicNewsRecipe


class TelepolisNews(BasicNewsRecipe):
    title = u'Telepolis (News+Artikel)'
    __author__ = 'syntaxis'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co KG'
    description = 'News from Telepolis'
    category = 'news'
    oldest_article = 1
    max_articles_per_feed = 100
    recursion = 0
    no_stylesheets = True
    encoding = "utf-8"
    language = 'de'
    remove_empty_feeds = True

    keep_only_tags = [
        dict(name='div', attrs={'class': 'head'}),
        dict(name='div', attrs={'class': 'leftbox'}),
        dict(name='td', attrs={'class': 'strict'}),
    ]
    remove_tags = [
        dict(name='td', attrs={'class': 'blogbottom'}),
        dict(name='div', attrs={'class': 'forum'}),
        dict(name='div', attrs={'class': 'social'}),
        dict(name='div', attrs={'class': 'blog-letter p-news'}),
        dict(name='div', attrs={'class': 'blog-sub'}),
        dict(name='div', attrs={'class': 'version-div'}),
        dict(name='div', attrs={'id': 'breadcrumb'}),
        dict(attrs={'class': 'tp-url'}),
        dict(attrs={'class': 'blog-name entry_'}),
    ]
    remove_tags_after = [dict(name='span', attrs={'class': ['breadcrumb']})]

    feeds = [(u'News', u'http://www.heise.de/tp/news-atom.xml')]

    html2lrf_options = [
        '--comment', description,
        '--category', category,
        '--publisher', publisher,
    ]
    html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'

    def preprocess_html(self, soup):
        mtag = '<meta http-equiv="Content-Type" content="text/html; charset=' + self.encoding + '">'
        soup.head.insert(0, mtag)
        return soup
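The `keep_only_tags` and `remove_tags` entries in the recipe above are tag specifications: an optional tag name plus attribute values to match. Here is a simplified model of how such a specification could be checked against an element. It assumes that attribute values are compared as whole strings and that a list means "any of these values"; the real matching is done by the HTML parser bundled with calibre and may differ in detail. The `matches` helper and the sample elements are hypothetical.

```python
def matches(spec, name, attrs):
    """Return True if an element (tag name + attribute dict) satisfies a spec.

    Simplified model (an assumption, not calibre's code): attribute values
    are compared as whole strings; a list in the spec means "any of these".
    """
    if 'name' in spec and spec['name'] != name:
        return False
    for key, wanted in spec.get('attrs', {}).items():
        allowed = wanted if isinstance(wanted, list) else [wanted]
        if attrs.get(key) not in allowed:
            return False
    return True


# A few of the specifications from the recipe.
remove_tags = [
    dict(name='div', attrs={'class': 'forum'}),
    dict(name='div', attrs={'class': 'blog-letter p-news'}),
    dict(attrs={'class': 'tp-url'}),
]

# Hypothetical elements, written as (tag name, attributes) pairs.
elements = [
    ('div', {'class': 'forum'}),
    ('div', {'class': 'blog-letter p-news'}),
    ('div', {'class': 'blog-letter w-news'}),
    ('span', {'class': 'tp-url'}),
]

for name, attrs in elements:
    removed = any(matches(spec, name, attrs) for spec in remove_tags)
    print(name, attrs, '-> removed' if removed else '-> kept')
```

Under this model a specification without `name` (like the `tp-url` one) matches any tag, and a class value that is not listed in full is left in the output.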
#8
aka zonebattler
Posts: 30
Karma: 50
Join Date: Oct 2003
Location: Fürth, Germany
Device: Kindle KB, Kindle PW Signature Edition (11. Gen)
Hi syntaxis,
thank you for your quick reaction! Your updated recipe works fine; however, I have three issues to report:
1) Despite the parameter max_articles_per_feed = 100 (which I reduced to 30), your script only fetches 11 articles, and I have no clue why...
2) Quite a few articles, but not all (try "Endlich Schluss mit Hartz IV"), start with a single lower-case character (here: "w") in the first line. This is only a minor aesthetic issue, of course.
3) It would be nice if chapter headlines (if present) were formatted as headlines (i.e. larger and bolder than the regular text). calibre manages to do that on its own if you have no recipe at hand and use the default mode instead...
Apart from that, everything works fine: all unnecessary stuff is trimmed off as it should be. Great! The only really annoying behaviour is that the script fetches fewer articles than it is supposed to. Perhaps you can find a way to fix that?
Thanks! Best wishes, Ralph
Last edited by juco; 05-09-2011 at 10:37 AM.
#9
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2010
Device: Kindle Paperwhite (2014)
Hi Juco,
if you want to see more articles, you have to increase the value in this line. Code:
oldest_article = 1
Regarding 2) and 3): I made some changes; it should work now. Code:
# -*- coding: utf-8 -*-

import re
from calibre.web.feeds.news import BasicNewsRecipe


class TelepolisNews(BasicNewsRecipe):
    title = u'Telepolis (News+Artikel)'
    __author__ = 'syntaxis'
    publisher = 'Heise Zeitschriften Verlag GmbH & Co KG'
    description = 'News from Telepolis'
    category = 'news'
    oldest_article = 1
    max_articles_per_feed = 100
    recursion = 0
    no_stylesheets = True
    encoding = "utf-8"
    language = 'de'
    remove_empty_feeds = True

    keep_only_tags = [
        dict(name='div', attrs={'class': 'head'}),
        dict(name='div', attrs={'class': 'leftbox'}),
        dict(name='td', attrs={'class': 'strict'}),
    ]
    remove_tags = [
        dict(name='td', attrs={'class': 'blogbottom'}),
        dict(name='div', attrs={'class': 'forum'}),
        dict(name='div', attrs={'class': 'social'}),
        dict(name='div', attrs={'class': 'blog-letter p-news'}),
        dict(name='div', attrs={'class': 'blog-sub'}),
        dict(name='div', attrs={'class': 'version-div'}),
        dict(name='div', attrs={'id': 'breadcrumb'}),
        dict(attrs={'class': 'tp-url'}),
        # Drop-cap letter boxes, one class per section
        dict(name='div', attrs={'class': [
            'blog-letter e-news', 'blog-letter m-news', 'blog-letter w-news',
            'blog-letter t-news', 'blog-letter k-news', 'blog-letter s-news',
        ]}),
    ]
    remove_tags_after = [dict(name='span', attrs={'class': ['breadcrumb']})]

    feeds = [(u'News', u'http://www.heise.de/tp/news-atom.xml')]

    html2lrf_options = [
        '--comment', description,
        '--category', category,
        '--publisher', publisher,
    ]
    html2epub_options = 'publisher="' + publisher + '"\ncomments="' + description + '"\ntags="' + category + '"'

    def preprocess_html(self, soup):
        mtag = '<meta http-equiv="Content-Type" content="text/html; charset=' + self.encoding + '">'
        soup.head.insert(0, mtag)
        return soup

    extra_css = '''
        h1 {color:#008852; font-family:Arial,Helvetica,sans-serif; font-size:25px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:22px;}
        h2 {color:#4D4D4D; font-family:Arial,Helvetica,sans-serif; font-size:18px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:16px;}
        h3 {color:#4D4D4D; font-family:Arial,Helvetica,sans-serif; font-size:15px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px;}
        h4 {color:#333333; font-family:Arial,Helvetica,sans-serif; font-size:12px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px;}
        h5 {color:#333333; font-family:Arial,Helvetica,sans-serif; font-size:11px; font-size-adjust:none; font-stretch:normal; font-style:normal; font-variant:normal; font-weight:bold; line-height:14px; text-transform:uppercase;}
    '''

Last edited by syntaxis; 05-16-2011 at 08:16 AM.
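Why raising `oldest_article` yields more articles even when `max_articles_per_feed` is already high can be illustrated with a small model: entries are first filtered by age in days, and only then is the count capped. This is a sketch of that idea under simplified assumptions, not calibre's actual feed handling; `select_articles` and the generated feed entries are hypothetical.

```python
from datetime import datetime, timedelta


def select_articles(entries, now, oldest_article, max_articles_per_feed):
    """Keep entries no older than oldest_article days, then cap the count."""
    cutoff = now - timedelta(days=oldest_article)
    recent = [e for e in entries if e['published'] >= cutoff]
    return recent[:max_articles_per_feed]


now = datetime(2011, 5, 9, 12, 0)
# Hypothetical feed: 40 entries, one every 3 hours, newest first.
entries = [
    {'title': 'Article %d' % i, 'published': now - timedelta(hours=3 * i)}
    for i in range(40)
]

for days in (1, 3, 7):
    picked = select_articles(entries, now, days, 30)
    print('oldest_article = %d -> %d articles' % (days, len(picked)))
```

With a small `oldest_article` the age filter is the limiting factor; the cap from `max_articles_per_feed` only takes effect once enough entries pass the age filter.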