Old 02-17-2009, 03:02 PM   #229
xianfox
Ebook Addict
Posts: 225
Karma: 2136
Join Date: Jul 2003
Location: Appleton, Wisconsin, USA
Device: Onyx BOOX Note Air 4C, Palma
I'm really close on a custom recipe for my local paper. Here is the code I currently have:

Code:
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1234841996(BasicNewsRecipe):
    title          = u'Appleton Post Crescent'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript     = True
    # Flatten tables so the layout survives on small reader screens
    html2lrf_options = ['--ignore-tables']
    html2epub_options = 'linearize_tables = True'
    # Drop the share/print toolbar, keep only the headline and body text
    remove_tags = [dict(name='div', attrs={'class':'article-tools'})]
    keep_only_tags     = [dict(name='div', attrs={'class':['article-headline', 'article-bodytext']})]

    feeds          = [
        (u'Latest Headlines', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSlatest.pbs&mime=xml'),
        (u'Local News', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSlocal.pbs&mime=xml'),
        (u'Sports', u'http://www.postcrescent.com/apps/pbcs.dll/misc?URL=/templates/RSSsports.pbs&mime=xml'),
    ]
The problem is, in the head section of their pages they have a malformed comment that looks like this:

Code:
<!--- OAS MACRO --->
My Sony Reader won't display the resulting output because of this malformed comment. I confirmed this by manually removing the comment from the generated epub file, after which the book displays flawlessly.

Can anyone help with a brief bit of code that I can add to my recipe to remove this stubborn comment?

Thanks
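
One possible approach, sketched here under assumptions rather than tested against the actual site: strip the comment with a regular expression before the page is parsed. The helper name strip_malformed_comments and the sample HTML are hypothetical and only illustrate the pattern. In a recipe, the same compiled pattern could probably be supplied through BasicNewsRecipe's preprocess_regexps attribute, as noted in the comment.

```python
import re

# Matches comments that open with "<!---" and close with "--->"
# (three dashes on each side), such as <!--- OAS MACRO --->.
# Well-formed comments of the form "<!-- text -->" are left alone.
MALFORMED_COMMENT = re.compile(r'<!---.*?--->', re.DOTALL)


def strip_malformed_comments(html):
    """Return html with every three-dash comment removed (hypothetical helper)."""
    return MALFORMED_COMMENT.sub('', html)


# Assumption: inside the recipe class, the equivalent would be
#     preprocess_regexps = [(MALFORMED_COMMENT, lambda match: '')]
# which asks calibre to apply the substitution to each downloaded page.

sample = '<head><!--- OAS MACRO ---><title>News</title></head>'
cleaned = strip_malformed_comments(sample)
print(cleaned)
```

The pattern is deliberately narrow, so ordinary comments elsewhere on the page are untouched. If the site uses other malformed variants, the expression would need widening.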