The Old Man,
You didn't have to wait long; attached is a quick-and-dirty profile that will download the first 10 articles in each of the following Jerusalem Post feeds:
Front Page
Israel News
International News
Middle East News
Editorials
kovidgoyal,
The last bit of code fixed up the problem with pubdate in the profile for Agenzia Fides.
I am still having some problems with how the summary is displayed (cosmetic but ugly): various HTML tags show up in the text, most notably <b></b> and <br>.
Meanwhile I have started on one for the Christian Science Monitor, and they have one wild way of directing you to the files. The href (and, later on, a <link></link>) points you to:
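For the stray tags, one possible approach is to strip the markup from the summary text before it is displayed. A minimal sketch using only Python's standard library; strip_tags is a hypothetical helper, not part of the profile code, and where it would be called from is left open:

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text content and discards every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Treat <br> as a word break so neighbouring words do not fuse.
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(summary):
    parser = _TextOnly()
    parser.feed(summary)
    parser.close()
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join("".join(parser.parts).split())
```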
http://rss.csmonitor.com/~r/feeds/to...4s01-woaf.html
which resolves to
http://www.csmonitor.com/2008/0124/p04s01-woaf.html
with the print version being at
http://www.csmonitor.com/2008/0124/p04s01-woaf.htm
The rub is that if you change the original address to
http://rss.csmonitor.com/~r/feeds/to...04s01-woaf.htm
it too resolves to the .html file.
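Since the rss.csmonitor.com address always redirects to the .html page, one way around this might be to resolve the redirect first and only then swap the extension. A rough sketch, assuming the print version always differs only by .htm versus .html as in the example above; resolve and print_url are hypothetical helper names:

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def resolve(feed_url):
    # Follow the redirect and return the final address (needs network access).
    with urlopen(feed_url) as response:
        return response.geturl()


def print_url(article_url):
    # Swap a trailing .html for .htm in the path; leave other URLs untouched.
    parts = urlsplit(article_url)
    path = parts.path
    if path.endswith(".html"):
        path = path[: -len(".html")] + ".htm"
    return urlunsplit(parts._replace(path=path))
```

The print address would then be print_url(resolve(link)), where link is whatever the feed's href gives you.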
At first I thought this was going to be an easy one: the date is in the number 222417173, so all we have to do is convert it to an ASCII date, format it as '/%Y/%m%d/' to get /2008/0124/, and build the required address string. It doesn't work: read as an epoch date, the number resolves to 1977 01 18. I can fix it by adding 2001 01 07 as an offset (that may have to be 06). Is that likely to be legitimate? Have I overlooked something?
The Christian Science Monitor also does not return a valid pubdate, and unless you set use_pubdate = False you get nowhere. However, in examining the source for the feed, there always seem to be two date entries for each article:
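A quick way to check the arithmetic is to decode the number against more than one starting point. The sketch below reads 222417173 as seconds since 1970-01-01 (Unix) and, as a guess, as seconds since 2001-01-01, which is the reference date used by Apple's Core Foundation time values. Whether the Monitor's system actually counts from that date is an assumption, and the number in the address may not be a date at all.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumption: an alternative starting point of 2001-01-01 00:00:00 UTC.
ALT_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def decode(seconds, epoch):
    # Add the second count to the chosen starting point.
    return epoch + timedelta(seconds=seconds)


def url_date_path(moment):
    # Same layout as the /2008/0124/ part of the article address.
    return moment.strftime("/%Y/%m%d/")


stamp = 222417173
print(url_date_path(decode(stamp, UNIX_EPOCH)))  # /1977/0118/
print(url_date_path(decode(stamp, ALT_EPOCH)))   # /2008/0119/
```

Neither result matches /2008/0124/ exactly, so a fixed offset that happens to work for one article should be checked against several others before relying on it.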
articlesortdate="0222880260.000000"
articlelocaldate="0222885964.644872"
which seem to be the epoch dates of the files. Would it not be possible to capture either or both? Can I get at them in my profiles? I am a bit unsure what declarations would have to be made.
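If a profile can get at the raw feed source, the two attributes could be pulled out with a regular expression. A sketch under several assumptions: the attribute names are taken from the feed as quoted above, the wrapper element in the sample is made up, the values are guessed to be seconds counted from 2001-01-01 UTC (adjust EPOCH if that proves wrong), and article_dates is a hypothetical helper rather than anything the profile framework provides:

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: values count seconds from this date; change it if that is wrong.
EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

DATE_ATTR = re.compile(r'(articlesortdate|articlelocaldate)="(\d+(?:\.\d+)?)"')


def article_dates(source):
    # Map each attribute name found in the source to a decoded datetime.
    return {
        name: EPOCH + timedelta(seconds=float(value))
        for name, value in DATE_ATTR.findall(source)
    }


# Made-up wrapper element; only the two attributes come from the real feed.
sample = '<item articlesortdate="0222880260.000000" articlelocaldate="0222885964.644872">'
for name, when in article_dates(sample).items():
    print(name, when.isoformat())
```

With that starting point both sample values land on 24 January 2008, which agrees with the /2008/0124/ in the article address.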