#2341 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Let us know if/when you have more questions. |
|
#2342 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jul 2010
Device: PRS-900
|
Hello, can anyone help me with this one?
http://sanduan.org/home/type.asp?iCa...&nChannel=News Thank you very much |
#2343 | |
Member
Posts: 11
Karma: 10
Join Date: Oct 2009
Device: Kindle International
|
I have put together a basic recipe to download new articles (or abstracts, if you aren't logged in) from Science Direct. Could someone help improve it? Currently, it does not bold or otherwise highlight the article titles, there seems to be a left indent that I'd prefer to get rid of, and it is downloading the versions of articles with small, grainy images instead of full-sized images. (To get larger images, I need to append "&artImgPref=F" to the URL, but my attempt below doesn't work).
Quote:
|
#2344 |
Member
Posts: 11
Karma: 10
Join Date: Oct 2009
Device: Kindle International
|
Following up my own message, I now have the article titles highlighted appropriately, though I would still appreciate help in getting the versions of articles with full-sized images and getting rid of the left margin if possible. My code at this point:
Code:
import re
from calibre.web.feeds.news import BasicNewsRecipe


class ScienceDirect(BasicNewsRecipe):
    title = u'Science Direct'
    __author__ = u'Barbara Robson'
    description = u'New journal articles from my favourite journals on Science Direct. Edit to choose your own favourites. Full text if you have an institutional login; abstracts otherwise.'
    oldest_article = 10
    max_articles_per_feed = 40
    no_stylesheets = True
    cover_url = 'http://rss.sciencedirect.com/images/logo_scid.gif'

    feeds = [
        (u'Environmental Modelling and Software', u'http://rss.sciencedirect.com/publication/science/6063'),
        (u'Ecological Modelling', u'http://rss.sciencedirect.com/publication/science/5934'),
        (u'Estuarine, Coastal and Shelf Science', u'http://rss.sciencedirect.com/publication/science/6776'),
        (u'Water Research', u'http://rss.sciencedirect.com/publication/science/5831'),
    ]

    # Attempt to request full-sized images. Strings have no append(), so the
    # suffix is concatenated instead. calibre does not call a method with
    # this name on its own, so this still has no effect on the download.
    def full_images(self, url):
        return url + "&artImgPref=F"

    remove_tags_before = dict(id='articleContent')

    # highlight article title
    preprocess_regexps = [
        (re.compile(r'(<div.class="articleTitle">)([^<]+)(<)'),
         lambda m: '%s<h2 class="h2">%s</h2>%s' % (m.group(1), m.group(2), m.group(3)))
    ]

    remove_tags_after = [dict(attrs={'class': 'SDTxtSmallBold'})]
    remove_tags = [dict(attrs={'class': 'SDTxtSmallBold'})]

Last edited by significance; 07-25-2010 at 07:53 PM. Reason: Clarification |
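For the full-sized images, one option might be calibre's `print_version` hook, which `BasicNewsRecipe` calls with each article URL before downloading it. The sketch below is only that: `with_full_images` is a hypothetical helper name, and whether Science Direct honours `artImgPref=F` on the URLs that come out of the RSS feeds is an assumption taken from the post, not something tested.

```python
def with_full_images(url, param="artImgPref=F"):
    """Return url with the full-size-image preference added once.

    Hypothetical helper: the parameter name comes from the post above;
    the choice of '?' versus '&' as separator is an assumption.
    """
    if param in url:
        # Already requested, leave the URL alone.
        return url
    separator = "&" if "?" in url else "?"
    return url + separator + param
```

Inside the recipe class this would be wired up as `def print_version(self, url): return with_full_images(url)`.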
#2345 |
Member
Posts: 11
Karma: 10
Join Date: Oct 2009
Device: Kindle International
|
Is it possible to create news recipes for journals that publish the articles only as PDFs? If so, I'd love one for Limnology and Oceanography (http://www.aslo.org/lo/toc/index.html), if someone has the time. Things that might make this difficult:
1) Articles are published as PDFs.
2) The main page has links to issues that are not yet available and issues that are in progress, with some articles available, but not all. I'd want to download only the latest complete issue.
3) Some articles are locked and available for purchase, while others are free to download. If your institution has a subscription, you can download even those available for purchase without paying again.
Too complicated? |
#2346 | |
Guru
Posts: 800
Karma: 194644
Join Date: Dec 2007
Location: Argentina
Device: Kindle Voyage
|
Quote:
Yes. I doubt anybody would undertake this task. At least not for free. |
|
#2347 | |
Member
Posts: 11
Karma: 10
Join Date: Oct 2009
Device: Kindle International
|
Quote:
If the PDFs don't need to be converted, I'll probably give it a go myself sooner or later. But thanks for your evaluation. I've only spent an afternoon with Calibre so far and I don't know Python, so I'm on a learning curve here. |
|
#2348 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Jul 2010
Device: colognesbook
|
I just got my first ebook reader and I'm looking at this thread for some interesting rss feeds.
However, what are the chances of a news feed posted in 2008 still working? I'm not too familiar with Calibre yet, but I understand it has gone through some changes, and I wasn't sure if I should attempt to use older recipes. |
#2349 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
|
|
#2350 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
It's worse than that. They're multi-column PDFs. All recipes currently create EPUBs, and Calibre can't convert multi-column PDFs (it's on the ToDo list).
I think you'd be better off with an automated website downloader, such as wget (possibly Web2Disk could also do it, but I'm more familiar with wget). You could restrict wget to grabbing the PDF, then use your batch or script to add the unconverted pdf file to Calibre. It should be possible to automate the whole thing, but not via the recipe system. I assume you have a pdf reader available that will read the unmodified files. |
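The wget-plus-script idea could also be done in a single script. The following is a rough Python sketch rather than the wget approach itself: it only finds the PDF links on a contents page and builds the `calibredb add` command for a downloaded file. The helper names are made up for illustration, and it makes no attempt to work out which issue is complete or which articles are locked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of links that point at .pdf files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page they came from.
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return every PDF link found in the given HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


def calibredb_add_command(pdf_path):
    """Build the command line that adds one unconverted PDF to the library."""
    return ["calibredb", "add", pdf_path]
```

The actual download (for example with `urllib.request.urlretrieve`) and running the command (for example with `subprocess.run(calibredb_add_command(path), check=True)`) are left out so the sketch runs without network access or a calibre install.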
#2351 | |
Member
Posts: 11
Karma: 10
Join Date: Oct 2009
Device: Kindle International
|
Quote:
|
|
#2352 |
Connoisseur
Posts: 51
Karma: 10
Join Date: Jul 2010
Device: colognesbook
|
Anyone want to try 24h Toronto?
They have other Canadian RSS feeds but I'm interested in the Toronto one.
http://eedition.toronto.24hrs.ca/epa...6225&type=full
http://eedition.toronto.24hrs.ca/epaper/viewer.aspx |
#2353 |
Zealot
Posts: 146
Karma: 189664
Join Date: Feb 2009
Device: Glo HD, Aura H20, PRS-T1
|
Perhaps I didn't ask nicely enough the first time around. I'd really like to have someone take a look at making a recipe for http://www.columbian.com/. I mentioned in my previous post that I'd like to have the recipe fetch the print edition. This is done by adding /?print to the end of the url.
I'd very much appreciate a recipe. |
#2354 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
The easier you make it, the more likely that someone will pick it up. In your case, you haven't given a link to the feed, just the main page. It's not that hard to write a recipe, and personally, I prefer to help those who've tried to write it and just need a bit of help. Sometimes, a user I've helped on one recipe has gone on to write a dozen more. If you want to try it yourself, try putting your feed(s) into the basic recipe option. Then hit the Advanced button and add this code:
Code:
def print_version(self, url):
    return url + '?print'
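Note that the request mentioned "/?print" while the snippet above appends '?print', so the result depends on whether the article URLs in the feed end in a slash. A minimal sketch that produces the "/?print" form either way, assuming that is the form the site expects (untested against columbian.com; `columbian_print_url` is a hypothetical name):

```python
def columbian_print_url(url):
    """Return url in the '/?print' form described in the request.

    Assumes article URLs carry no query string of their own.
    """
    if url.endswith("?print"):
        # Already a print URL, nothing to add.
        return url
    # Normalise to exactly one trailing slash before the query marker.
    return url.rstrip("/") + "/?print"
```

In the recipe, `print_version` would simply return `columbian_print_url(url)`.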
|
#2355 | |
award-winning bozo
Posts: 258
Karma: 172703
Join Date: Sep 2009
Location: Philadelphia
Device: Kobo Libra 2
|
Quote:
|
|