Found part of the solution: the documents are at least downloading now. The next step is to clean up the output before the ebook version is created. The original recipe needed a complete rewrite, and since it's a rewrite, I'm putting my own info into it.
So far the code is as follows:
Code:
import re  # only needed once the commented-out keep_only_tags block below is enabled
from calibre.web.feeds.news import BasicNewsRecipe


class dotnetMagazine(BasicNewsRecipe):
    __author__ = u'Bonni Salles'
    __version__ = '1.0'
    __license__ = 'GPL v3'
    __copyright__ = u'2013, Bonni Salles'
    title = '.net '
    oldest_article = 7
    no_stylesheets = True
    encoding = 'utf8'
    use_embedded_content = False
    language = 'en'
    remove_empty_feeds = True
    extra_css = ' body{font-family: Arial,Helvetica,sans-serif } img{margin-bottom: 0.4em} '

    # Cleanup rules carried over from the old recipe, disabled until they are checked against the new page layout
    # remove_tags_above = dict(id='header')
    # remove_tags_below = [dict(name='footer')]
    # keep_only_tags = [
    #     dict(name='article', attrs={'class': re.compile('^node.*$', re.IGNORECASE)}),
    # ]
    # remove_tags = [
    #     dict(name='span', attrs={'class': 'comment-count'}),
    #     dict(name='div', attrs={'class': 'item-list share-links'}),
    #     dict(name='footer'),
    # ]
    # remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding', 'colspan', 'valign', 'vspace', 'hspace', 'alt', 'width', 'height', 'style']
    # extra_css = 'img {max-width: 100%; display: block; margin: auto;} .captioned-image div {text-align: center; font-style: italic;}'

    feeds = [
        (u'net', u'http://feeds.feedburner.com/net/topstories')
    ]
Next I need to read up on how to remove tags before the HTML is processed; there's a lot on the page that isn't needed. It took a week to figure out that the recipe needed a complete rewrite.
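For the tag removal, here is a rough sketch of how the cleanup attributes could look once they are switched on. The selectors are simply the ones from the commented-out lines in the recipe; whether they still match the site's current HTML is an assumption that has to be checked against the page source. The class name DotNetCleanupSketch is made up for illustration, and the try/except stand-in is only there so the snippet can be checked outside calibre. As far as I can tell, calibre documents the trimming options as remove_tags_before and remove_tags_after, so the remove_tags_above / remove_tags_below names in the commented-out lines would probably need renaming.

```python
import re

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so this sketch can be loaded outside calibre; inside calibre the real class is used
    class BasicNewsRecipe(object):
        pass


class DotNetCleanupSketch(BasicNewsRecipe):  # hypothetical name, for illustration only
    title = '.net '
    feeds = [(u'net', u'http://feeds.feedburner.com/net/topstories')]

    # Keep only the article body; everything outside a matching tag is dropped.
    # Assumption: article nodes carry a class starting with "node".
    keep_only_tags = [
        dict(name='article', attrs={'class': re.compile('^node.*$', re.IGNORECASE)}),
    ]

    # Then strip leftovers inside the kept part (assumed selectors from the old recipe)
    remove_tags = [
        dict(name='span', attrs={'class': 'comment-count'}),
        dict(name='div', attrs={'class': 'item-list share-links'}),
        dict(name='footer'),
    ]

    # Presentation attributes removed from every remaining tag
    remove_attributes = ['border', 'cellspacing', 'align', 'cellpadding',
                         'colspan', 'valign', 'vspace', 'hspace', 'style']
```

The idea is to whitelist first with keep_only_tags and then blacklist what remains with remove_tags, which usually needs fewer rules than trying to remove every unwanted block one by one.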