#1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Dec 2014
Device: Kindle Touch
|
Recipe for "The Pickering Post"
I am a newbie, so please forgive my ignorance. I have tried to get Calibre to download The Pickering Post (and a couple of other websites that I follow) using the basic recipe in "Add a custom news source". All I get is the raw HTML source with the articles embedded. For the others I get just the cover or blank pages. Clearly I have a lot to learn. Could someone help me get started on the best way to approach this? The Pickering Post website is at:
http://pickeringpost.com/ with the first few articles at:
http://pickeringpost.com/story/most-...ti-muslim/4331
http://pickeringpost.com/story/2014-...ty-awards/4323
http://pickeringpost.com/story/a-ter...ver-xmas-/4317
This is a free subscription site. Any help would be appreciated. Thanks.
#2 |
Vox calibre
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
|
Here is something to start with:
Code:
    from calibre.web.feeds.news import BasicNewsRecipe


    class PickeringPost(BasicNewsRecipe):
        title = u'The Pickering Post'
        language = 'en'
        description = ''
        __author__ = 'Krittika Goyal'
        no_stylesheets = True
        no_javascript = True
        auto_cleanup = True

        def parse_index(self):
            soup = self.index_to_soup('http://pickeringpost.com/')

            # Find the table of contents on the front page
            toc = soup.find('div', id='articles')

            articles = []
            # Each article on the front page is a link containing an <h2> headline
            for x in toc.findAll(['a'], attrs={'class': ['timeline-article bf']}):
                tt = x.find('h2')
                title = self.tag_to_string(tt)
                # The hrefs are site-relative, so prepend the host
                url = 'http://pickeringpost.com' + x['href']
                self.log('\tFound article:', title, url)
                articles.append({'title': title, 'url': url})

            # parse_index returns a list of (section title, article list) pairs
            return [('Articles', articles)]
#3 |
Junior Member
Posts: 2
Karma: 10
Join Date: Dec 2014
Device: Kindle Touch
|
Thank you very much. It returns about 5 or 6 articles, but it does not work consistently. I have noticed that the website sometimes shows no content when I open it manually, but if I actually log in with my username and password, the articles show.
Does this mean that I will need to add code with my username and password to the script? Do you have an example I can add? Thanks.
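For the login question, calibre recipes usually avoid hard-coding credentials: setting needs_subscription = True makes calibre ask for a username and password, and get_browser() can be overridden to submit the site's login form before anything is downloaded. The following is only a rough sketch of that pattern, not something tested against The Pickering Post. LOGIN_URL, USER_FIELD and PASS_FIELD are placeholders I made up, and the assumption that the login form is the first form on the page is also a guess; all of them would need to be checked against the site's real login page. The try/except stub at the top is there only so the file can be imported outside calibre.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in used only when this file is read outside calibre.
    class BasicNewsRecipe(object):
        username = None
        password = None

        def get_browser(self, *args, **kwargs):
            raise NotImplementedError('run this recipe inside calibre')

# ASSUMPTIONS: none of these values come from the site, they are placeholders.
LOGIN_URL = 'http://pickeringpost.com/login'  # hypothetical login page
USER_FIELD = 'username'                       # hypothetical form field name
PASS_FIELD = 'password'                       # hypothetical form field name


def log_in(br, username, password):
    """Submit the login form using a mechanize-style browser object."""
    br.open(LOGIN_URL)
    br.select_form(nr=0)  # assumes the login form is the first form on the page
    br[USER_FIELD] = username
    br[PASS_FIELD] = password
    br.submit()
    return br


class PickeringPostLogin(BasicNewsRecipe):
    title = u'The Pickering Post'
    language = 'en'
    auto_cleanup = True
    # Makes calibre prompt for a username and password when scheduling.
    needs_subscription = True

    def get_browser(self, *args, **kwargs):
        br = BasicNewsRecipe.get_browser(self, *args, **kwargs)
        # Only log in when credentials were actually supplied.
        if self.username is not None and self.password is not None:
            log_in(br, self.username, self.password)
        return br
```

The parse_index method from the recipe in post #2 would go inside the same class unchanged; the cookies from the login should then be reused for the article downloads, since they go through the same browser.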