Quote:
Originally Posted by Gnome Eater
Here is a quick version that does not use the "print version". It therefore keeps the photos, but loses the continuation text, which appears only in the "print version". Another side effect is that the table of contents lists some items, such as "Presented by:", "Game: Animal Z's", "Now: Real Angry Birds", "Next: Elevator to Space", "Desktop Wallpaper", "Visions of Earth", "Flashback", "Your Shot" and "MyShot" (contents shown today), even though these items are not extracted. The "Photo Journal" is included, but has no heading before the first photo; remove "dict(attrs={'class':'slide'})" to drop this item. If you do not want the "Editor's Note", remove "dict(attrs={'class':'main_2wide'})".
Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1310601553(BasicNewsRecipe):
    title = u'NationalGeographicPrintEdition'
    oldest_article = 7
    max_articles_per_feed = 100
    use_embedded_content = False
    # Keep only these page elements; everything else is discarded.
    keep_only_tags = [
        dict(attrs={'class':'main_3narrow'}),
        dict(attrs={'class':'main_2wide'}),  # "Editor's Note"
        dict(attrs={'class':'slide'}),       # "Photo Journal"
    ]
    feeds = [(u'National Geographic Print Edition', u'http://feeds.nationalgeographic.com/ng/NGM/NGM_Magazine')]
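To illustrate the two removals described in the post, here is a minimal standalone sketch (plain Python, no calibre needed) that trims the keep_only_tags list by class name. The helper name without_classes and the variable names are hypothetical, made up for this example only; in the actual recipe you would simply delete the corresponding lines by hand.

```python
# The three entries from the recipe's keep_only_tags list.
KEEP_ONLY_TAGS = [
    dict(attrs={'class': 'main_3narrow'}),
    dict(attrs={'class': 'main_2wide'}),   # "Editor's Note"
    dict(attrs={'class': 'slide'}),        # "Photo Journal"
]


def without_classes(tag_specs, unwanted):
    """Hypothetical helper: return the specs whose 'class' is not in `unwanted`."""
    return [spec for spec in tag_specs
            if spec['attrs'].get('class') not in unwanted]


# Drop both the "Photo Journal" and the "Editor's Note".
trimmed = without_classes(KEEP_ONLY_TAGS, {'slide', 'main_2wide'})
print(trimmed)
```

The resulting list is what you would assign to keep_only_tags in the recipe if you want neither of the two optional items.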