#1
Junior Member
Posts: 4
Karma: 748
Join Date: Jan 2011
Device: Kindle 3
Having trouble getting complete article for Reading Eagle
I apologize in advance if this has been discussed -- I couldn't find it.
Here is the RSS feed: http://readingeagle.com/feeds/all/newsrss.xml
I only get the first few lines of each article. Here is my recipe:
Code:
class AdvancedUserRecipe1297542834(BasicNewsRecipe):
    title = u'Reading Eagle'
    use_embedded_content = True
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    no_stylesheets = True
    remove_empty_feeds = True
    feeds = [
        (u'local news', u'http://readingeagle.com/feeds/all/newsrss.xml'),
    ]

#2
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
Re: Trouble returning whole article
Quote:
def print_version(self, url): return self.browser.open_novisit(url).geturl() to no avail. How can I get my recipe to follow that Read More url? Is there a builtin recipe for another site that has the same problem that I could crib from? Maddeningly, there is a print version which downloads fine, but the url cannot be derived from the one for the non-print version because it uses a number unrelated to the original article title.

#3
creator of calibre
Posts: 45,643
Karma: 28549046
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You want use_embedded_content = False, not True.
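
Applied to the recipe from the first post, that change might look like the sketch below. The try/except stub is an assumption added only so the snippet loads outside calibre; inside calibre the real BasicNewsRecipe is used. The class name ReadingEagle is illustrative and not from the thread.

```python
try:
    from calibre.web.feeds.news import BasicNewsRecipe
except ImportError:
    # Stand-in (assumption) so the sketch can be loaded outside calibre.
    class BasicNewsRecipe(object):
        pass


class ReadingEagle(BasicNewsRecipe):
    title = u'Reading Eagle'
    # False: treat the feed as not containing full articles, so each
    # article is downloaded from its link instead of the feed text.
    use_embedded_content = False
    oldest_article = 7
    max_articles_per_feed = 100
    remove_javascript = True
    no_stylesheets = True
    remove_empty_feeds = True
    feeds = [
        (u'local news', u'http://readingeagle.com/feeds/all/newsrss.xml'),
    ]
```

With True, calibre assumes the feed entries already hold the whole article, which would match the "first few lines only" symptom when the feed carries just a summary.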

#4
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
Thanks much for your reply. Unfortunately, that just makes the TOC disappear, so now all I get is 'Start' and no content. Here is my recipe:
Code:
class AdvancedUserRecipe1322154189(BasicNewsRecipe):
    title = u'the Progressive'
    masthead_url = 'http://progressive.org/sites/all/themes/progress/logo.png'
    oldest_article = 7
    feeds = [u'http://feeds.feedburner.com/progressivefeed']

    def get_cover_url(self):
        soup = self.index_to_soup('http://progressive.org')
        item = soup.find('div', attrs={'class': 'views-field-field-cover-fid'})
        if item:
            return item.img['src']
        return None

David

#5
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
Quote:
Code:
class AdvancedUserRecipe1297542834(BasicNewsRecipe):
    title = u'Reading Eagle'
    oldest_article = 7
    max_articles_per_feed = 100
    remove_empty_feeds = True
    auto_cleanup = True
    feeds = [
        (u'local news', u'http://readingeagle.com/feeds/all/newsrss.xml'),
    ]

    def print_version(self, url):
        return url + '#'
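
One property of this print_version can be checked outside calibre: appending '#' only adds an empty fragment, so the returned URL still identifies the same page. The sketch below shows that with the standard library. The article URL is made up, and the thread does not explain why the extra '#' helps with this site, so treat this as an illustration of the URL rewrite only.

```python
from urllib.parse import urldefrag


def print_version(url):
    # Same body as the recipe method, written as a plain function.
    return url + '#'


# Hypothetical article URL, for illustration only.
article = 'http://readingeagle.com/article.aspx?id=12345'
rewritten = print_version(article)

# urldefrag splits off the fragment; what remains is the original URL.
base, fragment = urldefrag(rewritten)
print(rewritten)
print(base == article, repr(fragment))
```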

#6
Member
Posts: 18
Karma: 10
Join Date: Aug 2011
Device: Nook
Fixed my recipe for The Progressive using code from the recipe for Alternet! Here it is, for anyone else who wants it (it doesn't get you the whole magazine, just a few articles and some web-only content):
Code:
from calibre.ptempfile import PersistentTemporaryFile


class AdvancedUserRecipe1322154189(BasicNewsRecipe):
    title = u'the Progressive'
    masthead_url = 'http://progressive.org/sites/all/themes/progress/logo.png'
    oldest_article = 7
    articles_are_obfuscated = True
    use_embedded_content = False
    auto_cleanup = True
    temp_files = []
    feeds = [u'http://feeds.feedburner.com/progressivefeed']

    def get_article_url(self, article):
        return article.get('link', None)

    def get_obfuscated_article(self, url):
        # Open the article page, then follow its first /print/<number> link.
        br = self.get_browser()
        br.open(url)
        response = br.follow_link(url_regex=r'/print/[0-9]+', nr=0)
        html = response.read()
        # Save the print version to a temporary file and return its path.
        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(html)
        self.temp_files[-1].close()
        return self.temp_files[-1].name

    def get_cover_url(self):
        soup = self.index_to_soup('http://progressive.org')
        item = soup.find('div', attrs={'class': 'views-field-field-cover-fid'})
        if item:
            return item.img['src']
        return None
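
The key step in that recipe is follow_link(url_regex=r'/print/[0-9]+', nr=0), which picks the first link on the article page whose URL matches the pattern. The sketch below imitates that selection with only the standard library, so it can be tried without calibre or a network connection. The HTML snippet, the number 170123 and the helper name first_print_link are invented for illustration, and the browser's real link matching may differ in detail.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:
                self.links.append(href)


def first_print_link(html, pattern=r'/print/[0-9]+'):
    # Hypothetical helper: first href matching the pattern, or None.
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.links:
        if re.search(pattern, href):
            return href
    return None


# Made-up page: the print link's number is unrelated to the article slug,
# which is why the link has to be found on the page rather than derived.
page = '''
<a href="/some-article-title">Read More</a>
<a href="/print/170123">Printer-friendly version</a>
'''
print(first_print_link(page))
```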