Hi Starson.
Sorry I never noticed all those controls. Hopefully this is better.
I finally got the code to replace the URL with the print version, but it made no difference. Following on from Kovid's tips and yours, I loaded the various articles into Notepad++ and deleted everything between <head> and </head>.
This stopped the crash on the articles where the header was removed. (I've attached a copy of the resulting epub with the header from the first article removed.) The only visible effect is that the calibre-generated "previous / next / section / main" navigation header has larger text (and the Sony no longer crashes).
Here is the recipe as it stands
Code:
import time, re
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1306061239(BasicNewsRecipe):
    title = u'Out and about live'
    description = 'Camping and Caravan - News and Reviews'
    author = 'Dave Asbury'
    cover_url = 'http://www.outandaboutlive.co.uk/img/template/footer/illustration_3.jpg'
    masthead_url = 'http://www.outandaboutlive.co.uk/img/template/cloud_logo.gif'
    oldest_article = 56
    max_articles_per_feed = 100
    remove_empty_feeds = True
    remove_javascript = True
    no_stylesheets = True
    #remove_tags_before = dict(id='Body')

    # Strip the literal headings 'Other News' and 'Magazines' from the raw HTML
    preprocess_regexps = [
        (re.compile(r'Other News'), lambda h2: ''),
        (re.compile(r'Magazines'), lambda h4: '')
    ]

    keep_only_tags = [
        dict(attrs={'class': ['Content']})
    ]

    remove_tags = [
        dict(attrs={'class': ['ItemSummary', 'Buttons', 'jcarousel-skin-oal_magselector']}),
        # dict(name='head'),
        # dict(name='style')
        # dict(name='h4', attrs={'Magazines'})
    ]

    # Note: remove_attributes expects attribute names (e.g. 'style'),
    # so 'Other News' will not match anything here
    remove_attributes = ['Other News']

    def print_version(self, url):
        # Switch the article URL to its print-friendly version
        myurl = url.replace('/_', '/Print-_')
        print('New URL = %s' % myurl)
        return myurl

    feeds = [
        (u'Camping News', u'http://feeds.feedburner.com/OAL/News/Camping'),
        # (u'Camping Features', u'http://feeds.feedburner.com/OAL/Features/Camping'),
        # (u'Camping Reviews',u'http://feeds.feedburner.com/OAL/Reviews/Camping'),
        # (u'Caravan News',u'http://feeds.feedburner.com/OAL/News/Caravans'),
        # (u'Caravan Features',u'http://feeds.feedburner.com/OAL/Features/Caravans'),
        # (u'Caravan Reviews',u'http://feeds.feedburner.com/OAL/Reviews/Caravans')
    ]
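To check what print_version actually does to a URL outside of calibre, something like the following could be run on its own. It is only a sketch: the article URL is made up for illustration (it is not taken from the feed), and the function is a standalone copy of the substitution used in the recipe.

```python
def print_version(url):
    # Same substitution as in the recipe: every '/_' in the URL
    # becomes '/Print-_'. A URL without '/_' is returned unchanged.
    return url.replace('/_', '/Print-_')


# Hypothetical article URL, for illustration only
article = 'http://www.example.com/News/_some-article'
printable = print_version(article)
print(printable)

# A URL that does not contain '/_' comes back untouched
unchanged = print_version('http://www.example.com/News/some-article')
print(unchanged)
```

If the real feed links do not contain '/_', the replacement would silently do nothing, which may be worth ruling out.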
Here are the contents of the header I removed from the first article
At one point I also removed the <head> from the second article, and it stopped crashing too.
Can a post-process step be done to remove the <head></head> contents, as a second run so to speak? Is it possible there is a bug in Calibre?
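For what it's worth, the manual Notepad++ edit could be sketched in Python roughly as below. This is an untested idea rather than a known fix: HEAD_RE and empty_head are names made up here, and the sample HTML is invented. The recipe already uses preprocess_regexps with (pattern, function) pairs, so a pair like (HEAD_RE, lambda m: m.group(1) + m.group(2)) might fit there, but whether that runs at the right stage to prevent the crash is an assumption.

```python
import re

# Matches an opening <head ...> tag, everything after it, and the
# closing </head>, across newlines and regardless of case.
# The \b stops it from matching tags such as <header>.
HEAD_RE = re.compile(r'(<head\b[^>]*>).*?(</head>)', re.DOTALL | re.IGNORECASE)


def empty_head(html):
    """Return html with everything between <head> and </head> removed."""
    return HEAD_RE.sub(r'\1\2', html)


# Invented sample page, for illustration only
sample = ('<html><head><title>T</title>\n<style>p{}</style></head>'
          '<body><p>Hi</p></body></html>')
cleaned = empty_head(sample)
print(cleaned)
```

The opening and closing tags are kept so the document still has a (now empty) head element.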
(I'm on a course for 2 weeks tomorrow so replies may be difficult)
Edit: forgot to attach the epub. It is attached to the next message.