#1
I fixed the Friday Times recipe
I had been using the Friday Times recipe as a template because it was about the simplest parse_index recipe (that is, a recipe not based on an RSS feed) I could find. However, I eventually noticed that the recipe itself was broken, so I took a break from the other recipe I am working on and fixed the Friday Times recipe. Let me know if you have any criticisms or suggestions for the fix.
Code:
from calibre.web.feeds.news import BasicNewsRecipe


class TheFridayTimes(BasicNewsRecipe):
    language = 'en_PK'
    encoding = 'utf8'
    version = 1.1
    title = u'The Friday Times'
    category = u'news, Pakistan'
    description = u"Pakistan's First Independent Weekly Paper"
    no_stylesheets = True
    no_javascript = True
    ignore_duplicate_articles = {'url'}

    keep_only_tags = [
        dict(name='div', attrs={'class':'sidebar_content'}),
        dict(name='div', attrs={'class':'comment_inner'})
    ]
    remove_tags = [
        dict(name='p', attrs={'class':'no-break'}),
        dict(name='div', attrs={'class':'related_posts'}),
        dict(name='div', attrs={'id':'respond'})
    ]

    def parse_index(self):
        toc_page = self.index_to_soup('http://www.thefridaytimes.com/tft/')
        toc = toc_page.find('div', attrs={'class':'sidebar_left_home_wrapper'})
        articles = []
        for story in toc.findAll('a'):
            # skip the links with an image, they are repeated further down
            if story.find('img') is not None:
                continue
            url = story['href']
            # If no title, use url as title
            title = story.get('title', url)
            self.log('Found article:', story)
            self.log('\t', url)
            articles.append({'title':title, 'url':url, 'date':'','description':''})
        return [('Current Issue', articles)]

Last edited by ireadtheinternet; 11-25-2014 at 06:04 AM.
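For anyone who wants to try the link-filtering part of parse_index without calibre installed, here is a rough standalone sketch of the same idea using only the Python standard library. The names LinkCollector, build_index and SAMPLE_HTML are made up for this example and are not part of calibre or of the recipe. Unlike the recipe, it scans the whole HTML string rather than just the sidebar_left_home_wrapper div, skips links with no href, and falls back to the URL when the title is missing or empty.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect <a> tags and note whether each wraps an <img>."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished links, in document order
        self._current = None  # attributes of the <a> currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._current = dict(attrs)
            self._current['has_img'] = False
        elif tag == 'img' and self._current is not None:
            self._current['has_img'] = True

    def handle_endtag(self, tag):
        if tag == 'a' and self._current is not None:
            self.links.append(self._current)
            self._current = None


def build_index(html):
    """Return a list shaped like the recipe's parse_index result."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    articles = []
    for link in parser.links:
        # Skip image links (the recipe says they are repeated further down)
        # and, as an extra guard not in the recipe, links without an href.
        if link['has_img'] or not link.get('href'):
            continue
        url = link['href']
        # If no title, use the URL as the title.
        title = link.get('title') or url
        articles.append({'title': title, 'url': url,
                         'date': '', 'description': ''})
    return [('Current Issue', articles)]


# Made-up sample markup, only for illustration.
SAMPLE_HTML = '''
<div class="sidebar_left_home_wrapper">
  <a href="http://example.com/a"><img src="a.png"></a>
  <a href="http://example.com/a" title="Story A">Story A</a>
  <a href="http://example.com/b">Story B</a>
</div>
'''

feeds = build_index(SAMPLE_HTML)
```

The result has the same shape the recipe returns: a list of (section title, list of article dicts) pairs, here with a single 'Current Issue' section.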
#2
If the comments are not wanted, you can comment out the line
Code:
dict(name='div', attrs={'class':'comment_inner'})

EDIT: Noticed this is an official edit now, thanks.

Last edited by ireadtheinternet; 12-23-2014 at 11:46 PM.
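To make that concrete, the keep_only_tags list from the recipe would look roughly like this with the comments entry disabled. This is just the list on its own, not the whole recipe, and I have not checked it against the current site layout.

```python
keep_only_tags = [
    dict(name='div', attrs={'class': 'sidebar_content'}),
    # Commented out so reader comments are dropped from the download:
    # dict(name='div', attrs={'class': 'comment_inner'}),
]
```

With only the sidebar_content entry left, the article body is kept and the comment_inner blocks are no longer included.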
Thread | Thread Starter | Forum | Replies | Last Post |
Newsweek Polska - fixed recipe | admroz | Recipes | 1 | 10-16-2013 02:14 PM |
Fixed brand eins recipe | siebert | Recipes | 18 | 07-30-2013 06:56 AM |
Help with Recipe for The Friday Times | multani | Recipes | 0 | 03-11-2013 03:26 PM |
Fixed Sydney Morning Herald Recipe | zephram | Recipes | 0 | 09-29-2011 08:51 AM |
[fixed recipe] Wprost - polish newsmagazine | zaslav | Recipes | 0 | 06-26-2011 04:53 PM |