Quote:
Originally Posted by piflintstone
OK, I gave up on derstandard.at and would instead like to get the RSS feeds from "Die Presse" working.
I started with this one: http://diepresse.com/rss/home
Everything is working more or less fine (apart from the terrible formatting, which should be fairly easy to fix judging by the calibre manual), with one exception: the links in the feed point to the main pages of the articles. I was able to strip the unwanted stuff (menus etc.), but for some reason I was not able to include the article's picture.
Some pictures do show up in the final file, but the article's main picture does not. Any idea why?
Nobody can help you until you show the actual code.
You should use this to extract the article from that site:
Code:
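# Keep only the article container; everything outside it is discarded.
keep_only_tags = [dict(name='div', attrs={'class': 'article'})]
# Drop everything that follows the article body.
remove_tags_after = dict(name='div', attrs={'class': 'articletext'})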
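For context, here is a minimal sketch of how the two settings above might sit inside a complete recipe. The class name `DiePresse`, the title, and the `oldest_article` / `max_articles_per_feed` values are assumptions for illustration; only the feed URL and the two tag filters come from this thread, and the selectors have not been checked against the site's current markup. The `ImportError` fallback is also an assumption, added only so the file can be imported outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Fallback stub (assumption): lets the sketch be imported outside calibre.
    class BasicNewsRecipe(object):
        pass


class DiePresse(BasicNewsRecipe):  # hypothetical class name
    title = 'Die Presse'
    oldest_article = 2           # days; assumed value
    max_articles_per_feed = 50   # assumed value
    no_stylesheets = True        # ignore the site's own CSS

    # Feed URL from the original question.
    feeds = [('Home', 'http://diepresse.com/rss/home')]

    # Keep only the article container; everything outside it is discarded.
    keep_only_tags = [dict(name='div', attrs={'class': 'article'})]
    # Drop everything that follows the article body.
    remove_tags_after = dict(name='div', attrs={'class': 'articletext'})
```

Saved as, say, `diepresse.recipe` (a hypothetical filename), it can be tried with `ebook-convert diepresse.recipe .epub --test -vv`. Note that `keep_only_tags` throws away anything outside the matched `div`, so if the main picture lives in a different container it would need its own entry in that list.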