I am attaching a copy of the profile for the Christian Science Monitor. I am having a problem that you may have to see to understand. For reference, every article in the feed has a structure like this:
<div class="apple-rss-article apple-rss-read" onclick="javascript:handleArticleClick(this)" showSeparator="true"
articlesortdate="0223013377.017225" articlesorttitle="gaza busts out of its blockade" articlesortsource="" sourceindex="0" articlesortid="00000000000000000010" articlelocaldate="0223013377.017225" articleid="a91c09df43f4cf6a33ffed73cecf111efe81204 a">
<div class="apple-rss-article-footer"></div>
<div class="apple-rss-article-head" >
<div class="apple-rss-unread-dot"><img src="file://localhost/System/Library/Frameworks/PubSub.framework/Versions/A/Resources/PubSubAgent.app/Contents/Resources/unread.tif" width="9" height="9" /></div>
<div class="apple-rss-subject" title="Gaza busts out of its blockade"><a href="http://rss.csmonitor.com/~r/feeds/top/~3/222417168/p01s04-wome.html">Gaza busts out of its blockade</a></a></div>
<div class="apple-rss-summary" >A new hole opens in the Arab-Israeli peace strategy of isolating Hamas.</div>
<div class="apple-rss-date" title="Today, 10:09 PM">Today, 10:09 PM</div>
</div>
<div class="apple-rss-article-body-container">
<div class="apple-rss-article-body">
A new hole opens in the Arab-Israeli peace strategy of isolating Hamas.
<p><a href="http://rss.csmonitor.com/~a/feeds/top?a=rt0NVe"><img src="http://rss.csmonitor.com/~a/feeds/top?i=rt0NVe" border="0" /></a></p>
<div class="feedflare"><a href="http://rss.csmonitor.com/~f/feeds/top?a=7LSTtWD"><img src="http://rss.csmonitor.com/~f/feeds/top?i=7LSTtWD" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=bYiAxtD"><img src="http://rss.csmonitor.com/~f/feeds/top?i=bYiAxtD" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=ISh8dED"><img src="http://rss.csmonitor.com/~f/feeds/top?i=ISh8dED" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=FL3bvEd"><img src="http://rss.csmonitor.com/~f/feeds/top?i=FL3bvEd" border="0" /></a></div>
<img src="http://rss.csmonitor.com/~r/feeds/top/~4/222417168" height="1" width="1" />
<a class="apple-rss-article-link" href="http://rss.csmonitor.com/~r/feeds/top/~3/222417168/p01s04-wome.html">Read more…</a>
<!-- end articlebody --></div></div>
<!-- end article --></div>
The entire block:
A new hole opens in the Arab-Israeli peace strategy of isolating Hamas.
<p><a href="http://rss.csmonitor.com/~a/feeds/top?a=rt0NVe"><img src="http://rss.csmonitor.com/~a/feeds/top?i=rt0NVe" border="0" /></a></p>
<div class="feedflare"><a href="http://rss.csmonitor.com/~f/feeds/top?a=7LSTtWD"><img src="http://rss.csmonitor.com/~f/feeds/top?i=7LSTtWD" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=bYiAxtD"><img src="http://rss.csmonitor.com/~f/feeds/top?i=bYiAxtD" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=ISh8dED"><img src="http://rss.csmonitor.com/~f/feeds/top?i=ISh8dED" border="0" /></a> <a href="http://rss.csmonitor.com/~f/feeds/top?a=FL3bvEd"><img src="http://rss.csmonitor.com/~f/feeds/top?i=FL3bvEd" border="0" /></a></div>
<img src="http://rss.csmonitor.com/~r/feeds/top/~4/222417168" height="1" width="1" />
is being used as the summary on the contents page. I have tried many different patterns in the preprocess_regexps section, to no avail. I also tried setting summary_length = 0 (and 100, on the off chance that it did not accept 0 as an argument), again with no effect. Of course the profile is usable, but the output is ugly as sin!
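To make the goal concrete, here is a minimal sketch of the cleanup being attempted, using only the standard re module. It assumes that preprocess_regexps takes (compiled pattern, replacement function) pairs; the names cleanup_regexps and clean_summary are hypothetical and exist only for this illustration. Whether the profile applies such patterns to the feed summary at all, rather than only to the downloaded article pages, is not something this sketch can confirm.

```python
import re

# Hypothetical cleanup patterns. The (compiled_regex, replacement_function)
# pair shape is an assumption about what preprocess_regexps expects.
cleanup_regexps = [
    # Drop the "feedflare" block of linked icons.
    (re.compile(r'<div class="feedflare">.*?</div>', re.DOTALL | re.IGNORECASE),
     lambda match: ''),
    # Drop a paragraph that contains nothing but a linked image.
    (re.compile(r'<p>\s*<a [^>]*>\s*<img [^>]*>\s*</a>\s*</p>', re.IGNORECASE),
     lambda match: ''),
    # Drop the 1x1 tracking image.
    (re.compile(r'<img [^>]*height="1"[^>]*width="1"[^>]*/?>', re.IGNORECASE),
     lambda match: ''),
]


def clean_summary(html):
    """Apply every cleanup pattern in order and trim leftover whitespace."""
    for pattern, replacement in cleanup_regexps:
        html = pattern.sub(replacement, html)
    return html.strip()


# Shortened copy of the block quoted above (one feedflare link instead of four).
sample = '''A new hole opens in the Arab-Israeli peace strategy of isolating Hamas.
<p><a href="http://rss.csmonitor.com/~a/feeds/top?a=rt0NVe"><img src="http://rss.csmonitor.com/~a/feeds/top?i=rt0NVe" border="0" /></a></p>
<div class="feedflare"><a href="http://rss.csmonitor.com/~f/feeds/top?a=7LSTtWD"><img src="http://rss.csmonitor.com/~f/feeds/top?i=7LSTtWD" border="0" /></a></div>
<img src="http://rss.csmonitor.com/~r/feeds/top/~4/222417168" height="1" width="1" />'''

cleaned = clean_summary(sample)
print(cleaned)
```

Run on its own, this leaves only the one-sentence summary text, which is what I would like to see on the contents page.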
Finally, is it possible to embed an HTML option in the profile? Specifically, the --ignore-tables option; again, this is only for cosmetic effect.
Last edited by Deputy-Dawg; 01-26-2008 at 11:27 AM.
Reason: Uploaded repaired profile