Old 08-27-2011, 07:37 PM   #1
RichardN
Junior Member
RichardN began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Mar 2011
Location: London, UK
Device: Paperwhite
The Spectator Magazine - Request/Help

The Spectator is a UK political magazine without RSS for the main articles.

There are 7 main sections: Politics, Essays, Wit & Wisdom, Columnists, Business, Art, Books.

Each of these has several pages, each listing article headings with a few sentences and a link to the full article.

For example, if you look at http://www.spectator.co.uk/essays/ you will see one page with perhaps six articles and numbers leading to further pages.

The http://www.spectator.co.uk/business-and-investments/ page is similar but with a "click here for more articles" link.

I can see that each of these sections needs to be treated as a separate feed, but having done that, I can't see how to use the parse_index method, nor can I see a way to handle multiple pages other than hard-coding them.

If someone could write a recipe I would be grateful - even if it was only for the essays - I could then try to modify it for the other sections.

Richard N in London
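
On the multi-page question above, here is a minimal sketch of one way to walk numbered index pages without hard-coding each one. It is not taken from any working Spectator recipe. The helper name collect_section and the fetch_page callable are hypothetical, and the sketch assumes that a page past the end comes back empty or repeats earlier articles. The real URL pattern for the numbered pages is not given in this thread, so page fetching is replaced by a stand-in dictionary.

```python
def collect_section(fetch_page, max_pages=10):
    """Gather articles from consecutive index pages of one section.

    fetch_page(n) is a hypothetical callable that returns the list of
    article dicts found on page n (1-based), or an empty list when the
    page does not exist.
    """
    articles = []
    seen_urls = set()
    for page in range(1, max_pages + 1):
        added = 0
        for article in fetch_page(page):
            if article['url'] in seen_urls:
                continue  # already collected from an earlier page
            seen_urls.add(article['url'])
            articles.append(article)
            added += 1
        if added == 0:
            break  # empty or repeated page: assume we ran off the end
    return articles


# Stand-in for real page fetching: two pages of results, then nothing.
fake_site = {
    1: [{'title': 'A', 'url': '/essays/a'}, {'title': 'B', 'url': '/essays/b'}],
    2: [{'title': 'C', 'url': '/essays/c'}],
}
result = collect_section(lambda n: fake_site.get(n, []))
print([a['title'] for a in result])  # ['A', 'B', 'C']
```

In a recipe, fetch_page would typically build the URL of page n and parse it with self.index_to_soup. The max_pages cap keeps a badly behaved site from causing an endless loop.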
Old 09-04-2011, 06:22 PM   #2
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
 
Posts: 364
Karma: 228180
Join Date: Jan 2009
Device: Sony reader prs700, kobo
I have included 3 of the sections of the website. I also used auto_cleanup, which removes one or two pictures. You can do the clean-up in detail if you wish, but for the most part the auto clean-up works very well.

Hope this helps

Code:
from calibre.web.feeds.recipes import BasicNewsRecipe


class TheSpectator(BasicNewsRecipe):

    title = 'The Spectator'
    __author__ = 'Krittika Goyal'
    description = 'UK magazine'
    timefmt = ' [%d %b, %Y]'
    needs_subscription = False

    no_stylesheets = True
    auto_cleanup = True

    def articles_in_spec_section(self, section_url):
        # Collect every linked <h1> heading inside the page's main column.
        articles = []
        soup = self.index_to_soup(section_url)
        div = soup.find(id='centre')
        if div is None:
            # The expected container is missing, so there is nothing to parse.
            self.log('\tNo "centre" element found at', section_url)
            return articles
        for h1 in div.findAll('h1'):
            # Article found
            title = self.tag_to_string(h1)
            self.log('\tFound article:', title)
            a = h1.find('a', href=True)
            if a is None:
                continue
            url = a['href']
            if url.startswith('/'):
                url = 'http://www.spectator.co.uk' + url
            articles.append({'title': title, 'url': url,
                             'description': '', 'date': ''})
        return articles

    # Build the table of contents: a list of (section title, articles).
    def parse_index(self):
        sections = []
        for title, url in [
            ('Politics', 'http://www.spectator.co.uk/politics/all/'),
            ('Essays', 'http://www.spectator.co.uk/essays/'),
            ('Columnists', 'http://www.spectator.co.uk/columnists/all/'),
        ]:
            self.log('Processing section:', title)
            articles = self.articles_in_spec_section(url)
            if articles:
                sections.append((title, articles))
        return sections
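
For anyone adapting the recipe, it may help to see the shape of the value that parse_index returns: a list of (section title, list of article dicts) pairs. The sketch below runs outside calibre and only illustrates that structure. The check_index helper and the sample article URL are hypothetical and are not part of the recipe or of calibre.

```python
def check_index(index):
    """Return a list of problems found in a parse_index-style result."""
    problems = []
    for section_title, articles in index:
        if not articles:
            problems.append('%s: no articles' % section_title)
        for article in articles:
            # The recipe above always sets these two keys for each article.
            for key in ('title', 'url'):
                if not article.get(key):
                    problems.append('%s: article missing %s' % (section_title, key))
    return problems


# Hand-written sample in the same shape the recipe builds.
sample_index = [
    ('Politics', [
        {'title': 'Example article',
         'url': 'http://www.spectator.co.uk/politics/example',  # placeholder URL
         'description': '', 'date': ''},
    ]),
    ('Essays', []),
]
print(check_index(sample_index))  # ['Essays: no articles']
```

The recipe itself already drops empty sections with its "if articles:" test, so an empty section simply does not appear in the finished book.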
Old 10-12-2011, 12:44 PM   #3
JanMB
Junior Member
JanMB began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Oct 2011
Device: Kindle
The Spectator - digital edition - paid content

Hi,

The Spectator (UK) has a digital version that is available to subscribers. The content is different from the web news. I am a subscriber and I would like to read The Spectator on my reader.

I am also a subscriber to the German magazine Der Spiegel and I download it regularly. The recipe was created by Nikolas Mangold. I am very happy with it. Der Spiegel has two recipes, just as The Spectator should have, probably: one for the digital version of the print edition (paid content) and one for the web news.

Can anyone help?
Thank you very much.
Jan
Old 10-12-2011, 02:12 PM   #4
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by JanMB View Post
Can anyone help?
If you can't do it yourself, you will either need to find someone who is already a subscriber to do this job, or you will need to provide your subscription username and password to someone who can write it. It's very hard to write or debug a recipe if you can't access the site.
Old 10-13-2011, 04:43 AM   #5
RichardN
Junior Member
RichardN began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Mar 2011
Location: London, UK
Device: Paperwhite
I am happily using a very slightly expanded version of Krittika Goyal's code. There are certain sections it does not get correctly, and I will include them when I have debugged the problem. Try using this, which gives most of what is needed.


=============================================
Code:
from calibre.web.feeds.recipes import BasicNewsRecipe


class TheSpectator(BasicNewsRecipe):

    title = 'The Spectator'
    __author__ = 'Krittika Goyal'
    description = 'UK magazine'
    timefmt = ' [%d %b, %Y]'
    needs_subscription = False

    no_stylesheets = True
    auto_cleanup = True

    def articles_in_spec_section(self, section_url):
        # Collect every linked <h1> heading inside the page's main column.
        articles = []
        soup = self.index_to_soup(section_url)
        div = soup.find(id='centre')
        if div is None:
            # The expected container is missing, so there is nothing to parse.
            self.log('\tNo "centre" element found at', section_url)
            return articles
        for h1 in div.findAll('h1'):
            # Article found
            title = self.tag_to_string(h1)
            self.log('\tFound article:', title)
            a = h1.find('a', href=True)
            if a is None:
                continue
            url = a['href']
            if url.startswith('/'):
                url = 'http://www.spectator.co.uk' + url
            articles.append({'title': title, 'url': url,
                             'description': '', 'date': ''})
        return articles

    # Build the table of contents: a list of (section title, articles).
    def parse_index(self):
        sections = []
        for title, url in [
            ('Politics', 'http://www.spectator.co.uk/politics/all/'),
            ('Essays', 'http://www.spectator.co.uk/essays/'),
            ('Wit & Wisdom', 'http://www.spectator.co.uk/wit-and-wisdom/all/'),
            ('Columnists', 'http://www.spectator.co.uk/columnists/all/'),
            ('Arts', 'http://www.spectator.co.uk/arts-and-culture/featured/'),
            # ('Books', 'http://www.spectator.co.uk/books/'),
        ]:
            self.log('Processing section:', title)
            articles = self.articles_in_spec_section(url)
            if articles:
                sections.append((title, articles))
        return sections
==========================================
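
The recipe above only fixes up links that start with '/'. If a section page also uses links relative to the current page, a more general approach is to resolve every href against the page URL. This is a minimal sketch under that assumption; the article paths are made up for illustration, and the import shown is the Python 3 one (Python 2 keeps urljoin in the urlparse module).

```python
from urllib.parse import urljoin  # Python 3; Python 2 has urlparse.urljoin

BASE = 'http://www.spectator.co.uk/essays/'


def absolute_url(href, base=BASE):
    """Resolve an href found on a section page against that page's URL."""
    return urljoin(base, href)


# Hypothetical hrefs of the three kinds a page might contain.
print(absolute_url('/essays/some-article/'))  # http://www.spectator.co.uk/essays/some-article/
print(absolute_url('some-article/'))          # http://www.spectator.co.uk/essays/some-article/
print(absolute_url('http://example.com/x'))   # http://example.com/x
```

Already absolute URLs pass through unchanged, so the same call can be applied to every link without a startswith check.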

Last edited by Starson17; 10-13-2011 at 09:08 AM.
Old 10-13-2011, 08:05 AM   #6
PeterT
Taking a break; Fed up
PeterT ought to be getting tired of karma fortunes by now.
 
Posts: 6,847
Karma: 43933700
Join Date: Nov 2007
Location: Toronto
Device: Wife: Touch, Arc, Vox Me: Nexus 7, Glo
You might like to post that wrapped in [ code ] [ /code ] tags to preserve indentation. Remove the spaces from the tags.
Old 10-13-2011, 09:06 AM   #7
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by PeterT View Post
You might like to post that wrapped in [ code ] [ /code ] tags to preserve indentation. Remove the spaces from the tags.
I agree with your suggestion, but it's handy to know that the indents are actually preserved, just not displayed. If you need them, try quoting his post, as if replying, and they will appear. Copy from the text in the quote on the reply page, use that, then cancel the reply.
(Rather than leave it hard for others to use, I went ahead and added the code tags to his post.)

Last edited by Starson17; 10-13-2011 at 09:15 AM.
Old 10-13-2011, 09:20 AM   #8
PeterT
Taking a break; Fed up
PeterT ought to be getting tired of karma fortunes by now.
 
Posts: 6,847
Karma: 43933700
Join Date: Nov 2007
Location: Toronto
Device: Wife: Touch, Arc, Vox Me: Nexus 7, Glo
Quote:
Originally Posted by Starson17 View Post
I agree with your suggestion, but it's handy to know that the indents are actually preserved, just not displayed. If you need them, try quoting his post, as if replying, and they will appear. Copy from the text in the quote on the reply page, use that, then cancel the reply.
(Rather than leave it hard for others to use, I went ahead and added the code tags to his post.)
DUH..

I forgot the old "reply" trick!
Old 12-28-2012, 02:08 PM   #9
Spectrum
Zealot
Spectrum will become famous soon enough
 
Posts: 126
Karma: 570
Join Date: Nov 2008
Device: iPad 1 and iPad 4, KF HD 8.9"
Neither of the recipes in the thread above works (using version 0.9.11).

Last edited by Spectrum; 01-14-2013 at 10:42 AM.
Old 01-03-2013, 02:24 PM   #10
RichardN
Junior Member
RichardN began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Mar 2011
Location: London, UK
Device: Paperwhite
I have tried to understand what is happening with the Spectator, and it looked to me like there was some kind of encoding, possibly to deter applications like Calibre.
I couldn't sort it out.
Old 01-09-2013, 03:35 AM   #11
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
 
Posts: 364
Karma: 228180
Join Date: Jan 2009
Device: Sony reader prs700, kobo
Does this now need a subscription?
Old 01-09-2013, 04:07 AM   #12
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
 
Posts: 364
Karma: 228180
Join Date: Jan 2009
Device: Sony reader prs700, kobo
See if the attached file works.
Attached Files
File Type: zip spec.recipe.zip (1.0 KB, 44 views)
Old 01-11-2013, 10:01 AM   #13
Spectrum
Zealot
Spectrum will become famous soon enough
 
Posts: 126
Karma: 570
Join Date: Nov 2008
Device: iPad 1 and iPad 4, KF HD 8.9"
partial download

Strangely, the recipe downloads the page 1 links from the features page but not the contents of the magazine. Tried twice with the same result!
Recipe calls for:

return self.index_to_soup('http://www.spectator.co.uk/')

but defaults to

http://www.spectator.co.uk/features/

strange behavior!
Old 01-16-2013, 04:12 AM   #14
Krittika Goyal
Vox calibre
Krittika Goyal ought to be getting tired of karma fortunes by now.
 
Posts: 364
Karma: 228180
Join Date: Jan 2009
Device: Sony reader prs700, kobo
http://www.spectator.co.uk/ has 2 sections:
Coffee House in the left column and the magazine in the right column.
The recipe is designed to get the articles from the magazine column.
When I test it, that is exactly what it does.
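
The attached recipe is not reproduced in this thread, so the following is only a sketch of the general idea of taking links from one column of a page and ignoring the other. It uses the Python 3 standard library rather than calibre's soup, and the element ids "blog" and "magazine", the class name ColumnLinks and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class ColumnLinks(HTMLParser):
    """Collect hrefs of <a> tags nested inside the div with a given id."""

    def __init__(self, column_id):
        super().__init__()
        self.column_id = column_id
        self.depth = 0   # div nesting depth inside the target column; 0 = outside
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'div':
            if self.depth:
                self.depth += 1          # nested div inside the column
            elif attrs.get('id') == self.column_id:
                self.depth = 1           # entering the target column
        elif tag == 'a' and self.depth and attrs.get('href'):
            self.links.append(attrs['href'])

    def handle_endtag(self, tag):
        if tag == 'div' and self.depth:
            self.depth -= 1              # reaches 0 when the column closes


# Hypothetical two-column page: only the second column should be collected.
page = '''
<div id="blog"><a href="/blog/1">Blog post</a></div>
<div id="magazine">
  <div class="item"><a href="/mag/1">First</a></div>
  <div class="item"><a href="/mag/2">Last</a></div>
</div>
'''
parser = ColumnLinks('magazine')
parser.feed(page)
print(parser.links)  # ['/mag/1', '/mag/2']
```

Inside a recipe the same effect is usually achieved as in the earlier code in this thread: find the container with soup.find(id=...) and then search for links only within it.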

I am attaching a copy of the webpage as well as the epub obtained by calibre.

In both:
Britain’s accidental EU exit is the first article and
Greening’s challenge is the last article
Attached Files
File Type: epub spec.epub (533.7 KB, 41 views)
File Type: pdf spec_webpage.pdf (988.6 KB, 66 views)
Old 01-17-2013, 07:36 AM   #15
Spectrum
Zealot
Spectrum will become famous soon enough
 
Posts: 126
Karma: 570
Join Date: Nov 2008
Device: iPad 1 and iPad 4, KF HD 8.9"
partial download again... saga continues

You got the same results as I got. Just 8 articles from features section just like before - not the complete magazine. Sorry to repeat what I wrote before. Not sure why.
Tags
recipe, request, spectator, web
