| Calibre: an open-source library manager to view, convert and catalog e-books. Cross-platform (Linux, Windows and OS X). |
#1 |
|
Member
Posts: 19
Karma: 10
Join Date: Oct 2008
Device: Sony PRS-505
|
I'm having trouble getting the print versions of articles from the Orlando Sentinel. The problem is that they have completely different article numbers for the regular and print-friendly versions of a feature.
For instance, in this RSS feed:
http://feeds.feedburner.com/orlandosentinel

Regular version, with the link provided in the RSS feed:
http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,2581414.story

Print-friendly version (the link is found on the regular article's page):
http://www.orlandosentinel.com/business/orl-existing-home-sales-orlando-100908,0,95752,print.story

The link to the print-friendly version appears like this in the regular version's page:
Code:
<div><img src="/common/images/icons/atools-printer.gif" alt="Print" /><a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" rel="nofollow" >Print</a></div>

I already tried the following, but I think it's only looking at the actual RSS feed instead of each article, so it did not help.
Code:
def print_version(self, url):
    soup = self.index_to_soup(url)
    for item in soup.findAll('a', attrs={'rel':'nofollow'}):
        strhref = item['href']
        match = strhref.find('print.story')
        if match > -1:
            return strhref
    return None
#2 |
|
Creator of calibre, Ph.D.
Posts: 8,951
Karma: 37478
Join Date: Oct 2006
Location: Albuquerque, NM
Device: PRS-500/505/700, K2, BeBook
|
Code:
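import re  # at the top of the recipe file

def print_version(self, url):
    # Fetch the article page itself and look for its "Print" link
    for a in self.index_to_soup(url).findAll('a', href=re.compile(r'print\.story')):
        if a.string and 'Print' in a.string:
            # The href is site-relative, so prepend the domain
            return 'http://www.orlandosentinel.com' + a['href']
    # Fall back to the regular article if no print link is found
    return url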
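
To check the link-extraction logic outside of calibre, here is a minimal stand-alone sketch that runs against the HTML snippet quoted in the first post. It uses only the Python standard library rather than calibre's `index_to_soup`, and the names `PrintLinkFinder` and `find_print_url` are hypothetical helpers made up for this illustration, not part of any recipe API.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PrintLinkFinder(HTMLParser):
    """Remembers the href of the first <a> whose href contains 'print.story'."""

    def __init__(self):
        super().__init__()
        self.print_href = None

    def handle_starttag(self, tag, attrs):
        if tag != 'a' or self.print_href is not None:
            return
        href = dict(attrs).get('href')
        if href and 'print.story' in href:
            self.print_href = href


def find_print_url(article_url, page_html):
    """Return the absolute print URL, or article_url if no print link exists."""
    finder = PrintLinkFinder()
    finder.feed(page_html)
    if finder.print_href is None:
        return article_url
    # The page uses site-relative links, so resolve against the article URL
    return urljoin(article_url, finder.print_href)


# Sample data copied from the first post in this thread
article = ('http://www.orlandosentinel.com/business/'
           'orl-existing-home-sales-orlando-100908,0,2581414.story')
snippet = ('<div><img src="/common/images/icons/atools-printer.gif" alt="Print" />'
           '<a href="/business/orl-existing-home-sales-orlando-100908,0,95752,print.story" '
           'rel="nofollow" >Print</a></div>')

print(find_print_url(article, snippet))
```

Resolving the href with `urljoin` is one way to avoid hard-coding the domain; prepending `'http://www.orlandosentinel.com'` by hand should give the same result for site-relative links like the one above.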