06-14-2007, 02:43 AM | #1 |
Enthusiast
Posts: 40
Karma: 86
Join Date: Jan 2007
Location: USA, Philippines, Colombia, Mexico
Device: Ipod Touch 3G, 10" Acer Netbook, Kindle 3
|
Automating offline web content collection
I have been trying to find the best way to automate gathering information off the web for later offline reading (on a Nokia 800 internet tablet equipped with FBReader).
I will be traveling extensively for a long time, with only occasional internet access for my Nokia 800. While I do have access, I would like to automatically gather content from several web sites that I enjoy, for reading later offline: for instance, a couple of newspapers and the Wall Street Journal (I am a subscriber). This would also reduce the time I need to spend at internet cafes and let me catch up on reading during down time: when I am traveling by bus or plane, on days when I am resting, on the beach, etc. (I will have a spare battery and an external charger.) Paper books are not an option because they are too heavy and bulky for a trip of this length, and I will not be in any English-speaking countries where I could buy them along the way.
I have searched here and found several possible solutions, but I am not sure which is the right one, and each has a considerable learning curve just to determine what it can do. I have briefly looked at Dapper.net, Sitescooper, Website Puller, etc., but I don't really know what is possible with them. Reading RSS feeds offline is not really a solution for me, since the feeds are so incomplete (although that does get me part way with a few blogs that I read).
My dream setup would be something that automatically emails me, every day, an HTML file for the web sites that I am interested in, in a form that I could read offline later. Then whenever I pull down my email on my Nokia 800, I would have lots of offline content to read, and I wouldn't miss anything, since the content is being archived daily.
Can anyone suggest a direction for me to research? I am mostly looking for high-level direction. I don't even know where to start, or whether what I am looking for is even possible (within reason).
Thanks!
Travis |
06-14-2007, 09:11 AM | #2 |
Addict
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
|
If you know Python programming, you might want to use Scrape 'N' Feed to generate an RSS file from a website. There are many solutions to convert from RSS => HTML.
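As a rough illustration of the RSS => HTML step, here is a minimal Python sketch that uses only the standard library. It assumes a plain RSS 2.0 feed (no Atom, no namespaced extensions), and the function name `rss_to_html` is made up for this example; it is not part of Scrape 'N' Feed or bloglines2html.

```python
import html
import xml.etree.ElementTree as ET


def rss_to_html(rss_text: str) -> str:
    """Render the items of an RSS 2.0 feed as one self-contained HTML page."""
    channel = ET.fromstring(rss_text).find("channel")
    if channel is None:
        raise ValueError("not an RSS 2.0 document: no <channel> element")

    title = html.escape(channel.findtext("title", default="Untitled feed"))
    parts = [
        "<html><head><meta charset='utf-8'>",
        f"<title>{title}</title></head><body>",
        f"<h1>{title}</h1>",
    ]
    for item in channel.findall("item"):
        item_title = html.escape(item.findtext("title", default="(no title)"))
        link = html.escape(item.findtext("link", default=""), quote=True)
        # Feed descriptions usually already contain HTML markup, so they are
        # inserted as-is rather than escaped (an assumption about the feed).
        body = item.findtext("description", default="")
        parts.append(f'<h2><a href="{link}">{item_title}</a></h2>')
        parts.append(f"<div>{body}</div>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

The resulting string can be written to a single `.html` file and opened in any browser on the device. Feeds that carry only summaries will of course still produce incomplete pages.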
I mostly read blogs that I subscribe to via Bloglines, so I use a customized version of bloglines2html. That serves my needs. |
06-16-2007, 12:09 AM | #3 | |
Enthusiast
Posts: 40
Karma: 86
Join Date: Jan 2007
Location: USA, Philippines, Colombia, Mexico
Device: Ipod Touch 3G, 10" Acer Netbook, Kindle 3
|
Quote:
And unfortunately I don't have a machine on which to run cron jobs during my long trip, although I am sure that I could beg the use of one from one of my friends. It seems that any approach to this problem involves a steep learning curve and some custom work (which is what I wanted to confirm with this post, to see if I was missing something). I figure that I can do most of my blog reading offline in the RSS reader on the N800, refreshing the feeds whenever I have bandwidth. The blog writers seem to give complete feeds, although I miss out on the useful comments. Travis |
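For whoever ends up hosting the cron job, the "email me an HTML file daily" part might look roughly like the Python sketch below. It only builds the message, with the standard library's `email` package. The function name `build_digest_email`, the addresses, and the subject and file naming scheme are all assumptions made for illustration.

```python
from email.message import EmailMessage


def build_digest_email(html_page: str, sender: str, recipient: str,
                       date_label: str) -> EmailMessage:
    """Wrap one HTML page in an email as a file attachment."""
    msg = EmailMessage()
    msg["Subject"] = f"Offline reading digest {date_label}"
    msg["From"] = sender
    msg["To"] = recipient
    # Short plain-text body; the real content travels as the attachment.
    msg.set_content("Today's pages are attached as an HTML file.")
    # Passing a str with subtype="html" attaches it as text/html (UTF-8).
    msg.add_attachment(
        html_page,
        subtype="html",
        filename=f"digest-{date_label}.html",
    )
    return msg
```

Sending the result would then be a matter of handing it to `smtplib.SMTP(...).send_message(msg)` with the mail server details of the machine that runs the job, and scheduling the script once a day from cron.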