12-30-2009, 05:06 AM   #1043
evanmaastrigt
Quote:
Originally Posted by mtutalo
I am trying to get the Providence Journal
You cannot expect the recipe for The Washington Post to work on this website (or on any other website, for that matter). You have to examine the HTML of the site you want and adjust the recipe accordingly.

For a start, remove the extra_css and remove_tags properties. Then remove the get_article_url(), print_version() and postprocess_html() methods. Finally, add the following line:
Code:
keep_only_tags = [dict(name = 'div', attrs = {'id': 'storycontentleft'})]
From there, refine the result by removing unwanted tags (remove_tags) and adding CSS (extra_css) to your liking.
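For reference, a rough sketch of what the stripped-down recipe might look like after those changes. I have not tested it against the site. Only the keep_only_tags line comes from the advice above; the class name, the feed URL, the remove_tags entry and the CSS rule are placeholders made up for illustration, and the values for oldest_article and max_articles_per_feed are arbitrary. The fallback import is only there so the snippet can be loaded outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so the sketch can be loaded outside calibre.
    # Inside calibre the real BasicNewsRecipe is used instead.
    class BasicNewsRecipe(object):
        pass


class ProvidenceJournal(BasicNewsRecipe):
    title = 'Providence Journal'
    oldest_article = 2            # days; arbitrary value for illustration
    max_articles_per_feed = 100   # arbitrary value for illustration
    no_stylesheets = True

    # PLACEHOLDER: replace with the real RSS feed URL(s) of the site.
    feeds = [('News', 'http://example.com/rss.xml')]

    # Keep only the article body; this id is the one found in the site's HTML.
    keep_only_tags = [dict(name='div', attrs={'id': 'storycontentleft'})]

    # HYPOTHETICAL: add entries here once you see which tags are unwanted.
    remove_tags = [dict(name='div', attrs={'class': 'ad'})]

    # HYPOTHETICAL: style to taste.
    extra_css = 'h1 {font-size: large}'
```

Start with just keep_only_tags, look at the output, and only then fill in remove_tags and extra_css with whatever the actual HTML calls for.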