Business Week Magazine

blackberry4 · 01-18-2014, 10:13 AM

Hello

For a few months now, the Business Week Magazine recipe has been downloading only the headlines and no articles, and the cover image is badly out of date.

Any help would be great.

Thanks very much!

Divingduck · 01-18-2014, 02:04 PM

I made a quick update to this recipe. I hope it works for you.

Code:

import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from collections import OrderedDict

class BusinessWeekMagazine(BasicNewsRecipe):

    title       = 'Business Week Magazine'
    __author__  = 'Rick Shang, Armin Geller' # AGE Upd 2014-01-18

    description = 'A renowned business publication. Business news, trends and profiles of successful businesspeople.'
    language = 'en'
    category = 'news'
    encoding = 'UTF-8'
    keep_only_tags = [
            dict(name='div', attrs={'id':['content']}),         # AGE 2014-01-18
            ]
    remove_tags = [dict(name='hr'), 
                    dict(name='a', attrs={'class':'sub_sales'}),
                    dict(name='div', attrs={'class':'fieldset'}),
                    dict(name='div', attrs={'id':'taboola_wrapper'})] # AGE 2014-01-18
    remove_javascript = True  # documented BasicNewsRecipe option for stripping <script> tags
    no_stylesheets = True

    cover_url             = 'http://images.businessweek.com/mz/covers/current_120x160.jpg'

    def parse_index(self):
        #Go to the issue
        soup = self.index_to_soup('http://www.businessweek.com/magazine/news/articles/business_news.htm')

        #Find date
        mag=soup.find('h2',text='Magazine')
        dates=self.tag_to_string(mag.findNext('h3'))
        self.timefmt = u' [%s]'%dates

        #Go to the main body
        div0 = soup.find('div', attrs={'class':'column left'})
        section_title = ''
        feeds = OrderedDict()
        for div in div0.findAll('a', attrs={'class': None}):
            articles = []
            section_title = self.tag_to_string(div.findPrevious('h3')).strip()
            title=self.tag_to_string(div).strip()
            url=div['href']
            soup0 = self.index_to_soup(url)
            urlprint=soup0.find('a', attrs={'href':re.compile('.*printer.*')})
            if urlprint is not None:
                url=urlprint['href']
            articles.append({'title':title, 'url':url, 'description':'', 'date':''})

            if articles:
                if section_title not in feeds:
                    feeds[section_title] = []
                feeds[section_title] += articles
        div1 = soup.find('div', attrs={'class':'column center'})
        section_title = ''
        for div in div1.findAll('a'):
            articles = []
            desc=self.tag_to_string(div.findNext('p')).strip()
            section_title = self.tag_to_string(div.findPrevious('h3')).strip()
            title=self.tag_to_string(div).strip()
            url=div['href']
            soup0 = self.index_to_soup(url)
            urlprint=soup0.find('a', attrs={'href':re.compile('.*printer.*')})
            if urlprint is not None:
                url=urlprint['href']
            articles.append({'title':title, 'url':url, 'description':desc, 'date':''})
            if articles:
                if section_title not in feeds:
                    feeds[section_title] = []
                feeds[section_title] += articles

        # items() works on both Python 2 and Python 3; iteritems() is Python 2 only
        ans = list(feeds.items())
        return ans
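
A side note for anyone adapting this recipe: both loops above repeat the same two steps, namely preferring a printer-friendly link when the article page has one, and collecting articles under their section title in first-seen order. Below is a minimal sketch of those two steps as standalone helpers. It assumes the hrefs have already been extracted into a plain list, and the names `pick_url` and `group_by_section` are hypothetical, not part of calibre.

```python
import re
from collections import OrderedDict

# Same idea as the recipe's re.compile('.*printer.*'): any href containing "printer".
PRINTER_RE = re.compile(r'printer')


def pick_url(article_url, hrefs):
    """Return the first href that looks printer-friendly, else the article URL."""
    for href in hrefs:
        if PRINTER_RE.search(href):
            return href
    return article_url


def group_by_section(entries):
    """Group (section_title, article_dict) pairs, keeping first-seen section order."""
    feeds = OrderedDict()
    for section_title, article in entries:
        feeds.setdefault(section_title, []).append(article)
    # parse_index is expected to return a list of (section, articles) tuples.
    return list(feeds.items())
```

Used this way, each loop in `parse_index` would only need to build the `(section_title, article)` pairs and hand them to `group_by_section`.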

blackberry4 · 01-18-2014, 04:21 PM

Thank you very much! It's working perfectly now.
