01-26-2008, 11:41 PM   #35
Deputy-Dawg
Device: sony prs505
From DefaultProfile:

timefmt = ' [%a %d %b %Y]' # The format of the date shown on the first page
url_search_order = ['guid', 'link'] # The order of elements to search for a URL when parsing the RSS feed
pubdate_fmt = None # The format string used to parse the


This would imply that only the 'link' and 'guid' elements are searched for the link. This is borne out by the fact that when you process the feed from the Denver Post with

use_pubdate = False

you get the error message

Skipping article as it does not have a link url
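
To illustrate what I think is happening, here is a minimal sketch of a lookup driven by url_search_order. This is not the actual web2lrf code: the function name find_article_url and the two sample items are made up for illustration, and I am assuming the profile simply takes the first of the listed elements that has any text.

```python
import xml.etree.ElementTree as ET

url_search_order = ['guid', 'link']  # same value as in DefaultProfile

def find_article_url(item, search_order=url_search_order):
    """Hypothetical helper: return the text of the first element named in
    search_order that is present and non-empty, or None if there is none."""
    for tag in search_order:
        element = item.find(tag)
        if element is not None and element.text and element.text.strip():
            return element.text.strip()
    return None

# Two made-up RSS items: one with a <link>, one with neither <guid> nor <link>.
item_with_link = ET.fromstring(
    '<item><title>Example</title>'
    '<link>http://www.denverpost.com/ci_8088727</link></item>'
)
item_without_link = ET.fromstring('<item><title>Example</title></item>')

for item in (item_with_link, item_without_link):
    url = find_article_url(item)
    if url is None:
        print('Skipping article as it does not have a link url')
    else:
        print(url)
```

If the lookup works anything like this, an item that carries its URL somewhere other than 'guid' or 'link' would always be skipped.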

In the source of the feed, the following code appears for each article:

<li class="regularitem" xmlns:dc="http://purl.org/dc/elements/1.1/">
<h4 class="itemtitle">
<a href="http://www.denverpost.com/ci_8088727">
Man hit in crosswalk, killed
</a>
</h4>
<h5 class="itemposttime">
<span>Posted: </span>
Sat, 26 Jan 2008 20:09:37 -0700
</h5>
<div class="itemcontent" name="decodeable">
A 22-year-old Denver resident was killed in Aurora Saturday when a 71-year-old man driving a pickup ran a red light on South Parker Road, then veered into a crosswalk.
</div>
</li>

The URL for the article is contained only in the element with class itemtitle.
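
For what it's worth, pulling the URL out of markup like this is not hard outside of web2lrf. The sketch below is only an illustration using Python's standard library, not something web2lrf offers as far as I know. The function name url_from_itemtitle is made up, and it assumes the markup is well-formed XML, which the fragment above happens to be.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Denver Post fragment quoted above.
snippet = """
<li class="regularitem" xmlns:dc="http://purl.org/dc/elements/1.1/">
<h4 class="itemtitle">
<a href="http://www.denverpost.com/ci_8088727">
Man hit in crosswalk, killed
</a>
</h4>
</li>
"""

def url_from_itemtitle(markup):
    """Hypothetical helper: return the href of the <a> inside the
    <h4 class="itemtitle"> element, or None if it is not there."""
    item = ET.fromstring(markup.strip())
    anchor = item.find("h4[@class='itemtitle']/a")
    return anchor.get('href') if anchor is not None else None

print(url_from_itemtitle(snippet))
```

Markup that is not well-formed would need a more forgiving HTML parser instead of ElementTree.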

Similarly, in the feeds from Izvestia the URL is contained only in the classes

mainnewstime and mainnewsnotice

and even then only the variable part of the link, in the form:

/world/asia/20080127/97803220.html

This has to be concatenated with http://www.rian.ru to obtain the fully qualified address.
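
The joining step itself could look something like the sketch below. It uses urljoin from the Python 3 standard library; the names BASE_URL and absolute_url are my own, and nothing here is a web2lrf setting.

```python
from urllib.parse import urljoin

BASE_URL = 'http://www.rian.ru'  # site prefix mentioned above

def absolute_url(relative_path, base=BASE_URL):
    """Hypothetical helper: combine the site prefix with the variable
    part of the link to get the fully qualified address."""
    return urljoin(base, relative_path)

print(absolute_url('/world/asia/20080127/97803220.html'))
# prints http://www.rian.ru/world/asia/20080127/97803220.html
```

urljoin also leaves a link alone if it is already fully qualified, so it should be safe to apply to every link in a feed.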

Is it possible to handle either of these cases in web2lrf?

BTW, a profile runs much faster in the Terminal than when embedded in libprs500. Also, I have found that if I attempt to run more than about 3 profiles sequentially, libprs500 crashes. I can get around the problem by quitting and restarting; there is no need to remove the previously captured feeds.