Old 03-01-2011, 11:15 AM   #1
davide125
Member
Posts: 10
Karma: 10
Join Date: Mar 2011
Device: Amazon Kindle 3G
LWN.net Weekly News recipe

Hi,

I've made a quick recipe to get the latest weekly edition of LWN.net on my Kindle. It will fetch the current edition if you enter your LWN credentials and have a current subscription; failing that, it will get the latest free edition (usually a few weeks behind).

I have some Python experience, but this is my first try at a calibre recipe. Feel free to comment if you see a better way to do things.

Cheers,
Davide
Attached Files
File Type: zip lwn_weekly.zip (1.8 KB, 337 views)
Old 03-01-2011, 11:17 AM   #2
kovidgoyal
creator of calibre
Posts: 45,600
Karma: 28548974
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
There's already an LWN recipe in calibre; is yours different?
Old 03-01-2011, 11:37 AM   #3
davide125
Quote:
Originally Posted by kovidgoyal View Post
There's already an LWN recipe in calibre; is yours different?
Yes, the LWN recipe already in calibre downloads the latest news published on the lwn.net homepage (i.e. their RSS feed). My recipe downloads the "weekly edition", which is a magazine-like publication made by the LWN.net folks. The weekly edition may include some content that is also syndicated on the lwn.net homepage/RSS feed, but most of it is original content.

Davide
Old 03-01-2011, 11:54 AM   #4
kovidgoyal
Ah, added
Old 03-22-2011, 02:51 PM   #5
wcooley
Junior Member
Posts: 8
Karma: 10
Join Date: Dec 2010
Device: Graphite Kindle DX
I am not having any luck getting this recipe to work. I have verified that I am getting the correct URL by sending the request through my proxy and checking the requested URL via curl.

I've run ebook-convert from the command line (with and without '--test'):

$ ebook-convert LWN.net\ Weekly\ Edition_6.recipe /tmp/lwn-calibre -d /tmp/lwn-calibre-debug -vv

What's left in either directory, however, appears to be just the skeleton without content. Presumably it's getting destroyed by the processing.

It would be nice if I could preserve the temp directory that it downloads the content into. The contents should be the same as the -d debug directory, but I have a feeling they aren't. (In trying to prevent the temp directory from being removed, I've gone so far as to try "os.kill(getpid(), 9)", but due to forking/threading I'm not getting it in the right process or at the right time.)
Old 03-22-2011, 02:56 PM   #6
kovidgoyal
Simply remove the remove_tags, keep_only_tags, etc. fields from the recipe.

And if you want to look at the downloaded HTML, implement the preprocess_html method in the recipe and save the soup yourself to a temp file.
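
A minimal sketch of that suggestion, assuming Python 3 and a current calibre. Only the hook name preprocess_html(self, soup) comes from the advice above; the helper dump_html, the class DebugDumpMixin, and the file prefix are hypothetical names made up for illustration. In a real recipe the method would sit directly on the BasicNewsRecipe subclass.

```python
import os
import tempfile


def dump_html(markup, prefix="lwn-debug-"):
    """Write markup to a uniquely named file that is not deleted afterwards."""
    # mkstemp creates the file and leaves cleanup to the caller, so the dump
    # survives after the conversion has finished.
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".html")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(markup)
    return path


class DebugDumpMixin(object):
    # Hypothetical holder class; copy the method onto your own recipe class.
    def preprocess_html(self, soup):
        path = dump_html(str(soup))
        print("saved downloaded HTML to", path)
        return soup  # hand the (unmodified) soup back to the caller
```

Because the files are created outside calibre's own temp directory, they should still be there for inspection once ebook-convert exits.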
Old 03-22-2011, 03:47 PM   #7
wcooley
Quote:
Originally Posted by kovidgoyal View Post
Simply remove the remove_tags, keep_only_tags, etc. fields from the recipe.
Thanks; I should've mentioned that I've tried that. I should've also mentioned I'm on 0.7.50.

Quote:
Originally Posted by kovidgoyal View Post
And if you want to look at the downloaded HTML, implement the preprocess_html method in the recipe and save the soup yourself to a temp file.
Thanks; preprocess_html does not seem to get called in this case, but I was able to put it in the parse_index function. (Well, it is now for the articles returned from parse_index.)

Pretty-printing the 'ans' that is returned from parse_index shows that it hasn't found any sections or content: [('Front Page', [])]
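
For comparison, here is a rough sketch of what a populated result would look like next to the empty one above. The article keys (title, url, description, content, date) are the ones the recipe's article dict uses; the helper names build_index and has_articles and the sample row are hypothetical.

```python
def build_index(rows):
    """Group (section, title, url) rows into [(section, [article, ...]), ...]."""
    sections = {}
    for section, title, url in rows:
        article = dict(title=title, url=url,
                       description='', content='', date='')
        sections.setdefault(section, []).append(article)
    # dicts keep insertion order (Python 3.7+), so sections stay in page order
    return list(sections.items())


def has_articles(index):
    """True if at least one section holds at least one article."""
    return any(articles for _section, articles in index)


# Placeholder row; the article number is invented for the example.
rows = [('Front Page', 'Example article', 'http://lwn.net/Articles/000000/')]
print(has_articles([('Front Page', [])]))  # the empty result from the post
print(has_articles(build_index(rows)))     # a result with one article
```

A check like has_articles is a quick way to tell a parsing failure apart from a download failure while debugging.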

Ok, now that I'm actually digging into it and not just trying ad hoc debugging, I'm seeing that there are a number of problems. First, the article URL in the actual content is relative, so re.compile('^http://lwn.net/Articles/') should be re.compile('^(http://lwn.net)?/Articles/'). But then at the end of the loop it's setting content='' in the article dict, so 'http://lwn.net' has to be prepended to url. But that means that rather than just using the article text that's inline, we're re-downloading each article individually. Yuck.
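
The two URL fixes described above can be sketched roughly as follows. The function name absolute_article_url is hypothetical, and escaping the dot in the host name is a small tightening that is not in the pattern quoted in the post.

```python
import re

# Accept both absolute and site-relative article links.
ARTICLE_RE = re.compile(r'^(http://lwn\.net)?/Articles/')


def absolute_article_url(href, base='http://lwn.net'):
    """Return an absolute article URL, or None if href is not an article."""
    if not ARTICLE_RE.match(href):
        return None
    # Relative links need the host prepended before they can be downloaded.
    return href if href.startswith('http://') else base + href


print(absolute_article_url('/Articles/000000/'))
print(absolute_article_url('http://lwn.net/Articles/000000/'))
print(absolute_article_url('/Login/'))
```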

I'll submit an updated recipe when I have something satisfactory.
Old 03-22-2011, 04:59 PM   #8
wcooley
Here is a version that works for me; I have tested only with the post-embargoed edition, although it should be the same. I would like to try to use only the one big page and split that into distinct articles but not right now.

https://github.com/wcooley/calibre_r..._weekly.recipe
Old 03-22-2011, 05:02 PM   #9
kovidgoyal
updated.
Old 09-23-2011, 07:58 AM   #10
dovf
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2011
Device: Kobo Wireless
This lwn_weekly recipe suddenly stopped working for me last night. I had to edit it as follows to get it working again:

Code:
--- lwn_weekly.recipe   2011-07-01 10:19:40.000000000 +0300
+++ lwn_weekly.recipe.new       2011-09-23 14:54:36.815730882 +0300
@@ -95,7 +95,7 @@
                 break
 
             article = dict(
-                title=tag_title.string,
+                title=tag_title.contents[0].string,
                 url= 'http://lwn.net' + tag_url['href'].split('#')[0] + '?format=printable',
                 description='', content='', date='')
             articles[section].append(article)
(explanation: The tag_title contains a link which contains the title string, so 'string' on tag_title itself is returning None.)

I don't know what has changed to make it suddenly stop working; it's possible that it's something on my system (any ideas what it could be?). But just in case it's something more widespread, I thought I'd share my experience.

Also, I'm new to calibre recipes and BeautifulSoup, so very likely there's a better/more robust way to fix this than what I've done.

Thanks!
Dov

Last edited by dovf; 09-23-2011 at 08:03 AM.
Old 09-23-2011, 11:30 AM   #11
kovidgoyal
Thanks, the robust fix is to use

title = self.tag_to_string(tag_title)
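
To see why a direct text lookup can come back empty when the title sits inside a link, here is a rough analogy using only Python's standard library (xml.etree.ElementTree, not BeautifulSoup or calibre). The sample markup is invented, and tag_to_string itself is calibre's helper and is not reimplemented here.

```python
import xml.etree.ElementTree as ET

# A heading whose text lives inside a nested link, as described in the thread.
heading = ET.fromstring(
    '<h2><a href="/Articles/000000/">Example title</a></h2>')

own_text = heading.text                  # text directly inside <h2>: none
full_text = ''.join(heading.itertext())  # text gathered from all descendants

print(own_text)
print(full_text)
```

Collecting text from the whole subtree keeps working whether or not the site wraps the title in extra tags, which is the property that makes the suggested fix more robust than indexing into contents[0].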
Old 01-05-2012, 01:38 PM   #12
ametzler
Zealot
Posts: 129
Karma: 567800
Join Date: Sep 2011
Location: Austria
Device: Kindle Paperwhite II
Hmm, it looks like LWN broke the recipe again. I get:

Conversion Error: Failed: Fetch news from LWN.net Weekly Edition
[...]
Exception: Could not find any articles.
Old 01-06-2012, 03:38 PM   #13
wcooley
I have updated the recipe on GitHub:

https://github.com/wcooley/calibre_r..._weekly.recipe

This includes the title fix mentioned above and a tag-title fix submitted by jerrykan.
Old 01-07-2012, 09:17 AM   #14
ametzler
Quote:
Originally Posted by wcooley View Post
I have updated the recipe on GitHub:
[...]
Splendid. Thank you very much.
Old 04-09-2012, 11:02 PM   #15
nbigaouette
Junior Member
Posts: 1
Karma: 10
Join Date: Apr 2012
Device: Samsung Galaxy Ace
Thanks a lot for the recipe! It will be a lot easier to read offline. I really appreciate it.

The only thing missing is the comments section. The comments on LWN are of high quality and always interesting to read. Would it be possible to include them? It might be harder to do, though...

Thanks again.