01-17-2011, 07:30 PM | #1 |
Member
Posts: 11
Karma: 10
Join Date: Jan 2011
Device: Kindle
|
removing articles from feeds with regexp
The RSS feeds from Ars Technica include so-called "Etc" items, which do not point to actual stories but are just links to other pages on the web.
I would like to remove these from the feeds so that they do not get turned into articles. One way to do it would be with a regular expression, since the titles of these items all start with "Etc:". Here is an example feed: http://feeds.arstechnica.com/arstechnica/apple/ Is there a BasicNewsRecipe method I could use for this? I might have missed one when going through the API. Thanks! |
01-17-2011, 07:44 PM | #2 |
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Override get_article_url and have it return None for the articles you want skipped.
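A minimal sketch of that suggestion, assuming the feed entry passed to get_article_url supports dict-style .get('title') access (as feedparser entries do). The recipe class name, its title, and the ETC_TITLE pattern are made up for illustration, and the fallback base class is only a stand-in so the snippet can be tried outside calibre:

```python
import re

try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in so this sketch runs outside calibre; NOT the real class.
    class BasicNewsRecipe(object):
        def get_article_url(self, article):
            return article.get('link', None)

# The unwanted items all have titles starting with "Etc:".
ETC_TITLE = re.compile(r'^\s*Etc:')


class ArsTechnicaApple(BasicNewsRecipe):
    # Hypothetical recipe name; the feed URL is the one from the question.
    title = 'Ars Technica (Apple)'
    feeds = [('Apple', 'http://feeds.arstechnica.com/arstechnica/apple/')]

    def get_article_url(self, article):
        # Returning None makes calibre skip this feed item.
        if ETC_TITLE.match(article.get('title', '') or ''):
            return None
        # Otherwise fall back to the default URL lookup.
        return BasicNewsRecipe.get_article_url(self, article)
```

The pattern is anchored at the start of the title, so a story whose title merely contains "Etc" somewhere else is still downloaded.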
|