06-16-2012, 06:39 PM   #1
terminalveracity
Member
New recipe for National Geographic Magazine

This recipe pulls the main articles of the print magazine from the magazine feed. Some of the miscellaneous sections (Your Shot, Desktop Wallpaper, and so on) don't parse properly, so the recipe filters them out by title.

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe

class NatGeoMag(BasicNewsRecipe):
    title                  = 'National Geographic Mag'
    __author__             = 'Terminal Veracity'
    description            = 'The National Geographic Magazine'
    publisher              = 'National Geographic'
    oldest_article         = 31
    max_articles_per_feed  = 50
    category               = 'geography, magazine'
    language               = 'en_US'
    publication_type       = 'magazine'
    cover_url              = 'http://www.yourlogoresources.com/wp-content/uploads/2011/09/national-geographic-logo.jpg'
    use_embedded_content   = False
    no_stylesheets         = True
    remove_javascript      = True
    recursions             = 1
    remove_empty_feeds     = True
    feeds                  = [('National Geographic Magazine', 'http://feeds.nationalgeographic.com/ng/NGM/NGM_Magazine')]
    remove_tags            = [dict(name='div', attrs={'class':['nextpage_continue', 'subscribe']})]
    keep_only_tags         = [dict(attrs={'class':'main_3narrow'})]
    extra_css              = """
                                h1 {font-size: large; font-weight: bold; margin: .5em 0; }
                                h2 {font-size: large; font-weight: bold; margin: .5em 0; }
                                h3 {font-size: medium; font-weight: bold; margin: 0 0; }
                                .article_credits_author {font-size: small; font-style: italic; }
                                .article_credits_photographer {font-size: small; font-style: italic; display: inline }
                             """

    def parse_feeds(self):
        # Let calibre build the feed list, then drop the sections that don't parse properly.
        unwanted = ('Flashback', 'Desktop Wallpaper', 'Visions of Earth',
                    'Your Shot', 'MyShot', 'Field Test')
        feeds = BasicNewsRecipe.parse_feeds(self)
        for feed in feeds:
            # Iterate over a copy so removing from feed.articles is safe.
            for article in feed.articles[:]:
                if any(name in article.title for name in unwanted):
                    feed.articles.remove(article)
        return feeds
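
If you want to try the title filter outside calibre, here is a rough sketch of the same idea. FakeArticle and FakeFeed are hypothetical stand-ins I made up for this example (calibre's real feed objects carry much more than a title), and the demo titles are invented, so treat it as an illustration rather than the recipe's actual code path.

```python
from dataclasses import dataclass, field

# Section names taken from the recipe's parse_feeds method.
UNWANTED = ('Flashback', 'Desktop Wallpaper', 'Visions of Earth',
            'Your Shot', 'MyShot', 'Field Test')


@dataclass
class FakeArticle:
    # Hypothetical stand-in: only the title attribute is modelled.
    title: str


@dataclass
class FakeFeed:
    # Hypothetical stand-in for a parsed feed holding a list of articles.
    articles: list = field(default_factory=list)


def drop_unwanted(feeds, unwanted=UNWANTED):
    """Remove articles whose title contains any unwanted section name."""
    for feed in feeds:
        # Rebuild the list in place instead of removing while iterating.
        feed.articles[:] = [a for a in feed.articles
                            if not any(name in a.title for name in unwanted)]
    return feeds


if __name__ == '__main__':
    # Invented titles, for demonstration only.
    demo = [FakeFeed([FakeArticle('Your Shot: Reader Photos'),
                      FakeArticle('A Made-Up Feature Story')])]
    for feed in drop_unwanted(demo):
        print([a.title for a in feed.articles])
```

The match is a plain substring test, same as in the recipe, so a feature article that happens to contain one of these phrases in its title would be dropped too.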


Note:
The other National Geographic recipes make use of the news feed: http://feeds.nationalgeographic.com/ng/News/News_Main

This recipe uses the magazine feed: http://feeds.nationalgeographic.com/ng/NGM/NGM_Magazine
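
To decide which section titles belong in the filter list, it can help to look at the item titles a feed carries. Below is a minimal sketch using only the standard library. It assumes the feed follows the usual RSS 2.0 layout (item elements with a title child), which I have not verified for either feed above, and SAMPLE_RSS is made-up data, not real feed content.

```python
import xml.etree.ElementTree as ET

# Made-up sample in RSS 2.0 layout; the real feed's contents may differ.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample Magazine Feed</title>
    <item><title>Your Shot: Example</title><link>http://example.com/1</link></item>
    <item><title>An Example Feature</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def item_titles(rss_text):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext('title', default='') for item in root.iter('item')]


if __name__ == '__main__':
    for title in item_titles(SAMPLE_RSS):
        print(title)
```

ET.fromstring also accepts bytes, so the result of downloading a feed URL (for example with urllib.request.urlopen(...).read()) could be passed to item_titles in the same way.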