Old 01-16-2011, 03:02 PM   #7
mufc
Connoisseur
Posts: 99
Karma: 170
Join Date: Nov 2010
Location: Airdrie Alberta
Device: Sony 650
U R RIGHT

I was testing that code in another Globe recipe and having problems. I tried yours this morning and it works fine. My apologies.
Let me ask you a question.
Is the code for removing articles carried out prior to downloading? I ask because my recipe does not include all sections of the G + M. The extra articles I lose could be because they show up first in a section that I do not download. Is that possible?
I could not get the single page layout to work at all.

Maybe I should just modify your recipe to suit my Sony 650.

Your recipe seems to have given me more questions than answers.

How much of this is generic, and could it be adapted to other recipes?
Spoiler:
def postprocess_html(self, soup, first_fetch):
    # Find and preserve single page article layout, can be first or last
    allArts = soup.findAll(True, {'id':'article'})
    if len(allArts) == 2:
        if len(allArts[0].contents) > len(allArts[1].contents):
            allArts[1].extract()
        else:
            allArts[0].extract()

    return soup

def parse_feeds(self, *args, **kwargs):
    parsed_feeds = BasicNewsRecipe.parse_feeds(self, *args, **kwargs)
    # Eliminate the duplicates
    urlSet = set()

    for feed in parsed_feeds:
        newArticles = []
        for article in feed:
            # Do not call feed.articles.remove() here: removing from the list
            # while iterating over it skips the next article. Rebuilding the
            # list below is enough.
            if article.url not in urlSet:
                urlSet.add(article.url)
                newArticles.append(article)

        feed.articles = newArticles

    return parsed_feeds
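
To check my understanding of the duplicate removal, here is a minimal sketch that runs outside calibre. The Article and Feed classes, the dedupe_feeds name, and the example.com URLs are all stand-ins I made up for illustration; they are not calibre's real classes. It assumes the feeds are processed in the order they are listed, so the first section that contains a URL keeps the article and later sections lose it.

```python
# Stand-in classes (hypothetical), NOT calibre's real Feed/Article types.
class Article:
    def __init__(self, title, url):
        self.title = title
        self.url = url


class Feed:
    def __init__(self, title, articles):
        self.title = title
        self.articles = list(articles)


def dedupe_feeds(feeds):
    """Keep only the first occurrence of each URL, walking feeds in order."""
    seen = set()
    for feed in feeds:
        kept = []
        for article in feed.articles:
            if article.url not in seen:
                seen.add(article.url)
                kept.append(article)
        # Replace the list in one step instead of removing items mid-loop.
        feed.articles = kept
    return feeds


# Made-up sample data: article "B" appears in both sections.
news = Feed('News', [Article('A', 'http://example.com/a'),
                     Article('B', 'http://example.com/b')])
sports = Feed('Sports', [Article('B again', 'http://example.com/b'),
                         Article('C', 'http://example.com/c')])

dedupe_feeds([news, sports])
print([a.title for a in news.articles])
print([a.title for a in sports.articles])
```

If that reading is right, only the feeds actually passed in take part in the comparison, since the set of seen URLs is built from them alone.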


Also, this is the first instance I have seen of using the mobile version instead of the web version. What is your reasoning, and could it be adapted for other recipes?