MobileRead Forums
10-09-2008, 01:47 PM   #1
Acey
Device: Sony PRS-505
Help with news recipe

I'm having trouble getting the print versions of articles from the Orlando Sentinel. The problem is that they have completely different article numbers for the regular and print-friendly versions of a feature.

For instance:

In this RSS feed: http://feeds.feedburner.com/orlandosentinel

Regular version with the link provided in RSS: http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,2581414.story

Print-friendly version (link is found on regular article's page): http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,95752,print.story

The link to the print-friendly version appears like this in the regular article's HTML:
Code:
<div><img src="/common/images/icons/atools-printer.gif" alt="Print" /><a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" rel="nofollow" >Print</a></div>
What would be the best way to get the printable versions instead of the regular articles?

I already tried this, but I think it only looks at the RSS feed itself rather than at each article, so it did not help.
Code:
def print_version(self, url):
		soup = self.index_to_soup(url)
		for item in soup.findAll('a', attrs={'rel':'nofollow'}):
			strhref = item['href']
			match = strhref.find('print.story')
			if match > -1:
				return strhref
				
			return None
Thanks in advance for any help you can provide.
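
A note on the attempt above: as posted, `return None` is indented inside the `for` loop, so the method gives up after the first `rel="nofollow"` link unless that link happens to be the print one. Here is a minimal sketch of the difference, using a plain list of hrefs instead of calibre's soup objects. The function names and the first sample href are made up for illustration; only the print link comes from the post.

```python
def first_print_href_buggy(hrefs):
    for href in hrefs:
        if href.find('print.story') > -1:
            return href
        return None  # inside the loop: exits after looking at the first href only


def first_print_href_fixed(hrefs):
    for href in hrefs:
        if 'print.story' in href:
            return href
    return None  # after the loop: reached only when no href matched


hrefs = [
    '/some/other/link.html',  # hypothetical nofollow link that precedes the print link
    '/business/orl-existing-home-sales-orlando-100908,0,95752,print.story',
]

print(first_print_href_buggy(hrefs))
print(first_print_href_fixed(hrefs))
```

Even with the loop fixed, the matched href is site-relative, so it still needs the host prepended before it can be fetched.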
10-09-2008, 03:36 PM   #2
kovidgoyal
Creator of calibre, Ph.D.
Code:
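import re

def print_version(self, url):
    # Fetch the article page itself and look for its "Print" link
    for a in self.index_to_soup(url).findAll('a', href=re.compile(r'print\.story')):
        if a.string and 'Print' in a.string:
            # The href is site-relative, so prepend the host
            return 'http://www.orlandosentinel.com' + a['href']
    # No print link found: fall back to the regular article
    return url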
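
To try the link-matching logic outside calibre, something like the following may help. It is only a sketch using the Python standard library in place of calibre's `index_to_soup`. `PrintLinkFinder` and `find_print_url` are hypothetical names, not part of calibre, and the HTML is the snippet quoted in the first post.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class PrintLinkFinder(HTMLParser):
    """Record the href of the first <a> whose href matches print.story
    and whose link text contains 'Print'."""

    def __init__(self):
        super().__init__()
        self._candidate = None  # href of the matching <a> that is currently open
        self._text = []         # text collected inside that <a>
        self.print_href = None  # result, or None if nothing matched

    def handle_starttag(self, tag, attrs):
        if tag == 'a' and self.print_href is None:
            href = dict(attrs).get('href') or ''
            if re.search(r'print\.story', href):
                self._candidate = href
                self._text = []

    def handle_data(self, data):
        if self._candidate is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._candidate is not None:
            if 'Print' in ''.join(self._text):
                self.print_href = self._candidate
            self._candidate = None


def find_print_url(article_url, html):
    """Return the absolute print URL, or the article URL if none is found."""
    finder = PrintLinkFinder()
    finder.feed(html)
    if finder.print_href is None:
        return article_url
    # urljoin resolves the site-relative href against the article's own URL
    return urljoin(article_url, finder.print_href)


html = ('<div><img src="/common/images/icons/atools-printer.gif" alt="Print" />'
        '<a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" '
        'rel="nofollow" >Print</a></div>')
article = ('http://www.orlandosentinel.com/business/'
           'orl-existing-home-sales-orlando-100908,0,2581414.story')

print(find_print_url(article, html))
```

Resolving the href with `urljoin` instead of hard-coding the host is a design choice for the sketch, not something the recipe requires. The fallback to the original URL mirrors the `return url` in the recipe method.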
All times are GMT -4.