09-24-2010, 02:32 PM   #1
noah
The Bay Citizen - recipe help

Thanks to TonytheBookworm, whose helpful post got me started with a recipe for The Bay Citizen.

I modified the recipe to extract content from the regular (non-print) story pages, because I wanted the pictures, which aren't included in the print versions.

Spoiler:
# this block is pretty much standard on all recipes
#----------------------------------------------------------------------------------------------------------
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = 'The Bay Citizen'
    language = 'en'
    __author__ = 'TonytheBookworm and noah'
    description = 'The Bay Citizen'
    publisher = 'The Bay Citizen'
    category = 'news'
    oldest_article = 1  # USE THIS TO DETERMINE HOW FAR BACK (IN DAYS) YOU WANNA GO IN THE FEED DATE WISE
    max_articles_per_feed = 20  # USE TO DETERMINE HOW MANY ARTICLES YOU WISH TO READ PER FEED
    no_stylesheets = True  # IGNORES THE SITE'S CSS STYLESHEETS (NOT JAVASCRIPT)

    masthead_url = 'http://media.baycitizen.org/images/layout/logo1.png'  # PUTS NICE LOGO ON MAIN MENU PAGE
    #---------------------------------------------------------------------------------------------------------

    # here we tell the recipe what feed(s) we wish to obtain
    #-----------------------------------------------------------------------------------------
    feeds = [
        ('Main Feed', 'http://www.baycitizen.org/feeds/stories/'),
    ]
    #------------------------------------------------------------------------------------------

    # keep only the story body from the regular (non-print) page
    keep_only_tags = [dict(name='div', attrs={'class':'story'})]

    # drop the social sharing bar inside the story
    remove_tags = [dict(name='div', attrs={'class':'socialBar'})]


It works pretty well, but I have two questions/problems:
  1. In both my version and Tony's, Calibre builds the section menu from each feed item's <media:title> element instead of its <title> element. How can I get it to use <title>, which is what it should be using?
  2. In my version, certain stories come out as complete gobbledygook: huge strings of strange characters. Help!?
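
To show what I mean in question 1, here is a minimal sketch using plain Python's xml.etree rather than Calibre's own feed parser. The feed fragment is made up for illustration (it is not copied from the real Bay Citizen feed), and item_titles is just a hypothetical helper name. I'm assuming the feed uses the usual Media RSS namespace for <media:title>.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Story headline</title>
      <link>http://example.com/story/1/</link>
      <media:content url="http://example.com/photo.jpg">
        <media:title>Photo caption</media:title>
      </media:content>
    </item>
  </channel>
</rss>"""

# Assumption: the feed declares the standard Media RSS namespace.
MEDIA_NS = {'media': 'http://search.yahoo.com/mrss/'}


def item_titles(feed_xml):
    """Return a (title, media_title) pair for every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    pairs = []
    for item in root.iter('item'):
        # Plain <title> that is a direct child of the item.
        title = item.findtext('title')
        # Namespaced <media:title>, which may be nested deeper in the item.
        media_title = item.findtext('.//media:title', namespaces=MEDIA_NS)
        pairs.append((title, media_title))
    return pairs


if __name__ == '__main__':
    for title, media_title in item_titles(SAMPLE_FEED):
        print('title:', title)
        print('media:title:', media_title)
```

The two elements live in different namespaces, so a parser can tell them apart. In my output the menu shows the second value when I would expect the first.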