Hi,
match_regexps and filter_regexps seem to be applied only during HTML link resolution.
I need them applied during feed resolution too, since some feeds contain heterogeneous links:
Code:
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dct="http://purl.org/dc/terms/">
<channel>
...
<item>
...
<link>http://www.liberation.fr/monde/0101318771-les-electeurs-israeliens-ont-choisi-le-pragmatisme</link>
...
</item>
<item>
...
<link>http://www.liberation.fr/monde/0101318514-repli</link>
...
</item>
<item>
...
<!-- Item I want to ignore because it references another website that isn't handled by my recipe -->
<link>http://secretdefense.blogs.liberation.fr/defense/2009/02/un-officier-fra.html</link>
...
</item>
...
</channel>
</rss>
I'd like to put something like
Code:
match_regexps = [r'http://www\.liberation\.fr/.*']
in my recipe to keep only the relevant items.
Is there another way to make that happen, or should I create a new ticket on Trac?
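To make the request concrete, here is a minimal standalone sketch of the behaviour I'm after. It does not use calibre at all: it relies only on Python's standard library, and the names filter_feed_links, MATCH_REGEXPS and SAMPLE_FEED are made up for illustration. It keeps a feed item only if its link matches at least one of the patterns.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as the match_regexps I'd like to use in the recipe
# (hypothetical standalone name, not a calibre attribute).
MATCH_REGEXPS = [re.compile(r'http://www\.liberation\.fr/.*')]


def filter_feed_links(rss_text, patterns=MATCH_REGEXPS):
    """Return the <link> URLs of feed items that match at least one pattern."""
    root = ET.fromstring(rss_text)
    kept = []
    for item in root.iter('item'):
        link = (item.findtext('link') or '').strip()
        # re.match anchors at the start of the URL, so other hosts are rejected.
        if any(p.match(link) for p in patterns):
            kept.append(link)
    return kept


# Reduced version of the feed above, with the elided parts left out.
SAMPLE_FEED = """<rss version="2.0" xmlns:dct="http://purl.org/dc/terms/">
<channel>
<item><link>http://www.liberation.fr/monde/0101318514-repli</link></item>
<item><link>http://secretdefense.blogs.liberation.fr/defense/2009/02/un-officier-fra.html</link></item>
</channel>
</rss>"""

if __name__ == '__main__':
    print(filter_feed_links(SAMPLE_FEED))
```

With this sample, only the www.liberation.fr item should survive, and the blog item should be dropped. That is what I'd expect match_regexps to do for feed items.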