Made a few changes to the Guardian recipe:
- Removed the adverts that appeared at the bottom of some articles. I found that there were two HTML sections in the soup, with all the relevant content in the first.
- Removed some of the info under the headline: the author mugshot, the "a version of this article appeared" spiel and the link to the article history. I have put a comment next to each of these in the remove_tags list to make them easy to re-enable if you choose.
- Removed the number next to the rating stars (which appear in reviews). You will probably want to undo this change if you disable images, as the number is then likely the only rating left to show (just remove the relevant code in preprocess_html).
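
For anyone who wants to see the shape of these changes without opening the recipe, here is a rough standalone sketch. It is not the recipe's actual code: the class names in remove_tags are made up for illustration, and keep_first_html_section is a hypothetical helper that works on a raw string, whereas the recipe does the equivalent on the soup.

```python
import re

# Same list-of-dicts shape that calibre recipes use for remove_tags.
# The class names below are hypothetical placeholders, not the real ones.
# To re-enable an item in the output, comment out its line.
remove_tags = [
    dict(name='div', attrs={'class': 'author-mugshot'}),   # author mugshot
    dict(name='p', attrs={'class': 'article-version'}),    # "a version of this article appeared" text
    dict(name='a', attrs={'class': 'article-history'}),    # link to article history
]


def keep_first_html_section(raw):
    """Return raw markup truncated after the first closing </html> tag.

    Hypothetical helper: it assumes the adverts sit in a second
    <html>...</html> section that follows the article's own section.
    """
    end = re.search(r'</html\s*>', raw, flags=re.IGNORECASE)
    if end is None:
        # No closing tag found, so leave the markup untouched.
        return raw
    return raw[:end.end()]


page = '<html><body>article</body></html><html><body>advert</body></html>'
trimmed = keep_first_html_section(page)
```

In this sketch, trimmed holds only the first section, so the advert section is gone. The regex approach is just a stand-in to keep the example free of dependencies; inside a recipe you would work with the parsed soup instead.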