04-25-2013, 11:33 PM   #1
Camper65
Need help updating the .net recipe

I'm trying to rewrite the default .net recipe. It seems the way the feed URL is handled has changed, and the recipe no longer works correctly.

I've been experimenting with it. I'm comfortable with HTML and CSS and know some PHP and Java. I never really studied Python, but I'm coming to understand more of it as I work through this.


This version gets the article titles, and sometimes the article descriptions, from the news feed, but it goes no further: it does not pass along the actual URL of each article, so the full article cannot be pulled.

Spoiler:
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from calibre.web.feeds.news import BasicNewsRecipe
import re


class NetMagazineRecipe(BasicNewsRecipe):
    __author__ = u'Marc Busqué <marc@lamarciana.com>'
    __url__ = 'http://www.lamarciana.com'
    __version__ = '1.0'
    __license__ = 'GPL v3'
    __copyright__ = u'2012, Marc Busqué <marc@lamarciana.com>'
    title = u'.net magazine Custom'
    description = u'net is the world’s best-selling magazine for web designers and developers, featuring tutorials from leading agencies, interviews with the web’s biggest names, and agenda-setting features on the hottest issues affecting the internet today.'
    language = 'en'
    tags = 'web development, software'
    oldest_article = 7
    remove_empty_feeds = True
    no_stylesheets = True
    auto_cleanup = True
    cover_url = u'http://media.netmagazine.futurecdn.net/sites/all/themes/netmag/logo.png'
    # remove_tags_above = dict(id='header')
    # remove_tags_below = [dict(name='footer')]

    # keep_only_tags = [
    #     dict(name='article', attrs={'class': re.compile('^node.*$', re.IGNORECASE)}),
    # ]
    # remove_tags = [
    #     dict(name='span', attrs={'class': 'comment-count'}),
    #     dict(name='div', attrs={'class': 'item-list share-links'}),
    #     dict(name='footer'),
    # ]
    # remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding', 'colspan', 'valign', 'vspace', 'hspace', 'alt', 'width', 'height', 'style']
    # extra_css = 'img {max-width: 100%; display: block; margin: auto;} .captioned-image div {text-align: center; font-style: italic;}'

    feeds = [
        (u'.net', u'http://feeds.feedburner.com/net/topstories?format=xml'),
    ]



(I have commented out the tag-cleanup settings until I get the recipe working; then I can adjust them to keep what is needed and drop what is not.)

To get it to pass along the URL of each FeedBurner entry, I'm trying the following:

Spoiler:
# vim:fileencoding=UTF-8:ts=4:sw=4:sta:et:sts=4:ai
from calibre.web.feeds.news import BasicNewsRecipe
import re


class NetMagazineRecipe(BasicNewsRecipe):
    __author__ = u'Marc Busqué <marc@lamarciana.com>'
    __url__ = 'http://www.lamarciana.com'
    __version__ = '1.0'
    __license__ = 'GPL v3'
    __copyright__ = u'2012, Marc Busqué <marc@lamarciana.com>'
    title = u'.net magazine Custom'
    description = u'net is the world’s best-selling magazine for web designers and developers, featuring tutorials from leading agencies, interviews with the web’s biggest names, and agenda-setting features on the hottest issues affecting the internet today.'
    language = 'en'
    tags = 'web development, software'
    oldest_article = 7
    remove_empty_feeds = True
    no_stylesheets = True
    auto_cleanup = True
    cover_url = u'http://media.netmagazine.futurecdn.net/sites/all/themes/netmag/logo.png'
    # remove_tags_above = dict(id='header')
    # remove_tags_below = [dict(name='footer')]

    # keep_only_tags = [
    #     dict(name='article', attrs={'class': re.compile('^node.*$', re.IGNORECASE)}),
    # ]
    # remove_tags = [
    #     dict(name='span', attrs={'class': 'comment-count'}),
    #     dict(name='div', attrs={'class': 'item-list share-links'}),
    #     dict(name='footer'),
    # ]
    # remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding', 'colspan', 'valign', 'vspace', 'hspace', 'alt', 'width', 'height', 'style']
    # extra_css = 'img {max-width: 100%; display: block; margin: auto;} .captioned-image div {text-align: center; font-style: italic;}'

    feeds = [
        (u'.net', u'http://feeds.feedburner.com/net/topstories?format=xml'),
    ]

    def get_article_url(self, article):
        # Use the entry's plain 'link' field as the article URL
        url = article.get('link', None)
        return url


Can anyone help me adjust how the URL is passed, so that the recipe turns each feed entry into an actual article URL and can download the articles? Unfortunately, there are no print versions of these articles, so the original pages must be used. Thanks.
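
For reference, here is a rough standalone sketch of the lookup I think get_article_url needs to do. It is only a guess under some assumptions: that each parsed feed entry behaves like a dict, and that FeedBurner's original-link element shows up under the key 'feedburner_origlink' (I have not confirmed this feed actually carries it). The function name pick_article_url is my own invention and not part of calibre.

```python
def pick_article_url(article):
    """Return a best-guess article URL from a dict-like feed entry, or None."""
    # Assumption: FeedBurner's original link, when present, is exposed as
    # 'feedburner_origlink'; otherwise fall back to the plain 'link' field.
    for key in ('feedburner_origlink', 'link'):
        url = article.get(key)
        if url and url.startswith(('http://', 'https://')):
            return url
    # Last resort: scan the entry's 'links' list for an alternate link.
    for item in article.get('links', []):
        if item.get('rel', 'alternate') == 'alternate' and item.get('href'):
            return item['href']
    return None


# Made-up sample entry, only to show the order of preference.
sample = {
    'feedburner_origlink': 'http://example.com/real-article',
    'link': 'http://feeds.feedburner.com/~r/net/topstories/~3/abc',
}
print(pick_article_url(sample))
```

If this is the right idea, the recipe's get_article_url(self, article) could presumably just return pick_article_url(article), but I have not been able to verify that against the live feed.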