Junior Member
Posts: 2
Karma: 10
Join Date: May 2013
Device: Kindle Paperwhite
How To Geek - Recipe Update
Today I updated my first recipe, so I'd appreciate any suggestions.
Improvements
Bugs
There is a page break after each converted <h2> tag in the created EPUB: <div class="mbp_pagebreak"></div>
How can I get rid of it? (I tried changing Calibre's common conversion options, but they don't seem to affect the news fetch, or do they?) Because of this, every article heading sits alone on one page and the content starts on the next page.
Also, I guess Calibre can't fetch 'lazy load' images? The images in the articles aren't fetched; only a gray circle shows up, which is the placeholder used by the 'lazy load' feature for these images.
Code:
# Based on TonytheBookworm's original recipe
__license__ = 'GPL v3'
__copyright__ = '2013, Johannes Kopf'

import re
from calibre.web.feeds.news import BasicNewsRecipe


class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = u'How To Geek'
    language = 'en'
    __author__ = 'Johannes Kopf'
    description = 'Daily Computer Tips and Tricks'
    publisher = 'Howtogeek'
    category = 'PC,tips,tricks'
    oldest_article = 2
    max_articles_per_feed = 50
    no_stylesheets = True
    remove_javascript = True
    masthead_url = 'http://blog.stackoverflow.com/wp-content/uploads/how-to-geek-logo.png'
    cover_url = 'http://www.howtogeek.com/geekers/up/sshot4ebc09559ecbf.jpg'
    recursions = 1

    # Fetch only links from howtogeek.com/number
    match_regexps = [r'http://www.howtogeek.com/\d*']

    remove_tags = [
        dict(name='img', attrs={'src':re.compile('.*readmore-button.png.*',re.IGNORECASE)}),
        dict(name='img', attrs={'class':re.compile('.*lazyLoad.*',re.IGNORECASE)})]

    remove_tags_before = dict(name='div', attrs={'class':['thecontent']})
    remove_tags_after = dict(name='div', attrs={'class':['thecontent']})

    keep_only_tags = [
        dict(name='div', attrs={'class':['thecontent']}),
        dict(name=['h2', 'h3']),
        dict(name='a', attrs={'href':re.compile(r'.*http://www.howtogeek.com/\d*.*',re.IGNORECASE)})]

    feeds = [(u'Tips', u'http://feeds.howtogeek.com/howtogeek')]

Last edited by JoxX; 05-10-2013 at 01:55 PM.
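One idea for the lazy-load images, as a rough sketch only: lazy-load scripts usually keep the real image URL in a separate attribute and put a placeholder in src, so the real URL could be copied into src before the images are fetched. The attribute name 'data-original' and the helper name promote_lazy_src are assumptions made for illustration; the actual attribute has to be looked up in the page source. The helper only handles double-quoted attributes.

```python
import re

# Hypothetical attribute name; check the page source for the one the site really uses.
LAZY_ATTR = 'data-original'

IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
SRC_ATTR = re.compile(r'\ssrc\s*=\s*"[^"]*"', re.IGNORECASE)


def promote_lazy_src(raw_html, lazy_attr=LAZY_ATTR):
    """Copy the lazy-load URL of each <img> tag into its src attribute."""
    lazy_re = re.compile(r'\b%s\s*=\s*"([^"]*)"' % re.escape(lazy_attr),
                         re.IGNORECASE)

    def fix(match):
        tag = match.group(0)
        lazy = lazy_re.search(tag)
        if lazy is None:
            return tag  # not a lazy-loaded image, leave it untouched
        # Drop the placeholder src, then insert the real URL right after "<img".
        tag = SRC_ATTR.sub('', tag, count=1)
        return tag[:4] + ' src="%s"' % lazy.group(1) + tag[4:]

    return IMG_TAG.sub(fix, raw_html)


sample = ('<p><img class="lazyLoad" src="gray.png" '
          'data-original="http://example.com/a.png"><img src="b.png"></p>')
print(promote_lazy_src(sample))
```

In the recipe this could probably be called from the preprocess_raw_html(self, raw_html, url) hook of BasicNewsRecipe, returning promote_lazy_src(raw_html). The remove_tags entry that drops images with the lazyLoad class would then have to be removed, otherwise the fixed images are stripped again.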
Tags: how to geek, recipe update