Old 03-13-2005, 04:52 PM   #16
hacker
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
Quote:
Originally Posted by Laurens
Printable versions usually have no navigation or banner images and contain the entire article. Many NYT articles, for instance, are split across multiple pages in the "normal" version, requiring multiple requests to obtain them in their entirety.
What did these "big sites" have to say about your users, through the use of your software, depriving them of their advertiser revenue by bypassing their banner ads and other advertising resources when deep-linking to their print-only pages?

Quote:
Sunrise goes to great lengths to reduce bandwidth usage.
Incidentally, your tool (v0.41f) completely ignores robots.txt, which makes adhering to Entity Tags, Last-Modified, Cache-Control, and so on... basically irrelevant as far as bandwidth savings go. Your tool blindly allows slamming a site for content as fast as possible over and over and over until it has it all, or exhausts maximum fetch depth.
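For readers who have not worked with robots.txt, here is a minimal sketch of the kind of check being described, using Python's standard `urllib.robotparser`. The robots.txt content, the URLs, and the `ExampleSpider/0.1` user agent are all made up for illustration; this is not how Sunrise or any Plucker distiller actually implements it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real spider would download it from
# the site root (for example with RobotFileParser.set_url() and .read()).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cgi-bin/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable parser object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list) -> list:
    """Keep only the URLs this user agent is permitted to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
candidates = [
    "http://example.com/index.html",
    "http://example.com/search?q=plucker",
    "http://example.com/cgi-bin/stats",
    "http://example.com/articles/1.html",
]

# Filter the crawl queue before any request is sent.
to_fetch = allowed_urls(parser, "ExampleSpider/0.1", candidates)

# Seconds to wait between requests, if the site asks for a delay.
delay = parser.crawl_delay("ExampleSpider/0.1")

print(to_fetch)
print(delay)
```

A spider that filters its queue this way and sleeps for `delay` seconds between requests never touches the excluded paths and does not hit the site as fast as it can.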

While I think it's valuable for Windows users, it has a long way to go before it can compete with high-quality tools that follow Web and Internet standards.

Keep working on it, you'll get there.

Last edited by hacker; 03-13-2005 at 04:55 PM.
Old 03-14-2005, 09:36 AM   #17
doctorow
Guru
Posts: 914
Karma: 3410461
Join Date: May 2004
Device: Kindle Touch
I think it is cool that Sunrise offers so many ways to customize the download of webpages. On the other hand, I think one should always ask the webmaster first if he is OK with someone spidering his page.

Back to the topic: I started "doing" RSS just recently and still haven't found the best tool. Would you recommend web-based or desktop-based aggregators and why?

Last edited by doctorow; 03-14-2005 at 09:42 AM.
Old 03-14-2005, 11:31 AM   #18
Bob Russell
Recovering Gadget Addict
Posts: 5,379
Karma: 590871
Join Date: May 2004
Location: Pittsburgh, PA
Device: Note3/DVP11
Some ramblings intended to provide input...

First of all, Hacker, I'd say I'm a big believer in a variety of good tools. You and Laurens and some of the other developers like Picard are doing an incredible job of providing some awesome content tools. Even if there's an "edge" on some of the back-and-forth between you guys, it's really interesting because it shows how the priorities and goals drive the direction of the development, each with its unique advantages.

And, gee, even a lot of the "so-so" stuff out there is nice, so when the really good stuff is produced it's all the more a great delight for us.

I'm very curious where you're headed with all this! But to stick to the question about RSS advantages/disadvantages and usage, for me...

The advantage of RSS is
* A collection of my predetermined news and blog reading that takes absolutely no per-site setup or maintenance other than maybe to pass on a feed URL.
* If things worked "nice and easy", I'd love to put the content on my PDA just like I have a list of items set up in iSiloX. I click "Convert All" and let it run while I shower. Then I walk away in the morning with all the new content I've arranged for.

The disadvantages are:
* Usually the feed doesn't show the whole article. I'd like to be able, on Windows or PDA, to not have any delays loading pages or have to go to a full "messy" screenful of stuff to read the rest of the article.
* I get the impression that RSS content is usually a subset of the web page content. This may just be an error of perception on my part because of the "packaging", or because I only get to see new content, but it seems real enough to me that I usually go to the original web site instead.
* It takes more skills than I have to make RSS feeds "nice" for PDAs.
* RSS readers on a PDA take up additional memory I don't want to give up. One more program to install. In my case, I have a Toshiba e405 with 64meg RAM, 32meg internal flash, and an external memory card. Seems like plenty of space, but it's really not. I allocate the card to backups and music/video. The internal flash has some apps, my clipped content, and my eSword files. My internal memory has programs and a few ebooks that I'm currently reading. I wish I had more internal flash that I could load more programs on. The reason I don't put programs on my external gig card(s) is that I want to be able to swap out cards and not lose any "core" functionality, just music and video content. Like I said, it seems like 64meg would be plenty (even if only 32meg is allocated to program and file storage), but if you use your PDA heavily you find that you just don't have room for even some things you really wish you could install. In my case, that means more eSword files, Microsoft Streets and Trips, Sunrise, and probably this new tool you are working on. Even smaller things like loan calculators get left out because they often require space for an app booster or runtime files or .net framework, etc.
* You can't follow links easily. Not really an RSS issue, and I can't see how it could be overcome with a software tool unless there was some new fancy way to determine what links get followed, but if I read on my (non-wifi) PDA with RSS or iSiloX clippings of a site like Slashdot, I'm stuck if I want to pursue the topic further. I can't even follow the links in the article unless I grab way too much stuff ahead of time. I have a tendency, like a true addict, to check my favorite sites often through the day to get a news fix, but I still do it on an internet-connected computer because it's faster and easier and I can follow those links or Google for more information. I think desktop is still easier for reading that kind of thing. BUT, PDAs are great for reading "self-contained" articles and news like from newspapers or many blogs.

But even with the disadvantages (you can tell I'm not really a fan of RSS the way it's currently implemented), I'd love something like this if I could get the full article in text form without any hassle (kind of like those feeds Alex had set up, or what you can do with Sunrise if you know what you're doing).
Old 03-14-2005, 11:39 AM   #19
Laurens
Jah Blessed
Posts: 1,295
Karma: 1373
Join Date: Apr 2003
Location: The Netherlands
Device: iPod Touch
Quote:
Originally Posted by hacker
What did these "big sites" have to say about your users, through the use of your software, depriving them of their advertiser revenue by bypassing their banner ads and other advertising resources when deep-linking to their print-only pages?
None of them complained. Statistically, Sunrise does not make an impact.

Last time I checked, Plucker Desktop came configured with global exclusion filters for well-known ad URLs (windcaster and such). Did no-one complain about their lost ad revenue?

Quote:
Originally Posted by hacker
Incidentally, your tool (v0.41f) completely ignores robots.txt, which makes adhering to Entity Tags, Last-Modified, Cache-Control, and so on... basically irrelevant as far as bandwidth savings go. Your tool blindly allows slamming a site for content as fast as possible over and over and over until it has it all, or exhausts maximum fetch depth.
Plucker and iSilo also ignore robots.txt, don't they? Now why is this a problem all of a sudden? And how does ignoring robots.txt make caching irrelevant?
Old 03-14-2005, 12:29 PM   #20
Laurens
Jah Blessed
Posts: 1,295
Karma: 1373
Join Date: Apr 2003
Location: The Netherlands
Device: iPod Touch
I think it remains to be seen whether newsfeeds will really break into the mainstream. In my experience, most non-technical people are not interested because they don't have the time, or are not willing to spend the time, to read all the news.

I'm starting to see where they're coming from. Although I still spend significant time reading news through RSS feeds, I find myself turning down the noise more and more. I used to be subscribed to about eighty channels, now I'm down to thirty.

Newsfeeds are being billed as time-savers, but they can also be a huge distraction. That's why I never leave FeedDemon running in the background. (Incidentally, I don't leave e-mail checks running in the background either, nor do I use IM.)
Old 03-14-2005, 11:29 PM   #21
hacker
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
Quote:
Originally Posted by Laurens
Last time I checked, Plucker Desktop came configured with global exclusion filters for well-known ad URLs (windcaster and such). Did no-one complain about their lost ad revenue?
No, because when those were created, most of those advertisers were forcing garbage ads, spyware, popups and other trash on the users. Better off without them, in most cases. It's a fine line, to be sure.

Quote:
Plucker and iSilo also ignore robots.txt, don't they? Now why is this a problem all of a sudden? And how does ignoring robots.txt make caching irrelevant?
I think you mean Plucker's python distiller, not Plucker itself.

Plucker is a viewer, primarily, which supports a document format that can be produced by many tools. The two most-popular document creators for Plucker are currently the Python Distiller (used in Plucker Desktop), and Bill Nalens' C++ distiller. Until recently, the Python distiller did not support robots.txt; now it does.

There are also JPluck, Sunrise, pdaConverter, pler, Bluefish, and my own Perl spider (which, by the way, adheres to the robots exclusion specification and was the first and, until recently, only Plucker distiller to do so), and probably other tools we don't know about that can produce a Plucker document using the Plucker document format. At least a dozen commercial companies are using the Plucker viewer and document format now for their core product suites.

But the reason ignoring robots.txt makes caching irrelevant is that you are allowing your tool to fetch content it is forbidden from fetching via robots.txt. In many cases, the excluded portions of sites are dynamic, and the Last-Modified, ETag, etc. headers will either not be present or will force a re-fetch. It's wasteful, and it makes caching top-level pages irrelevant if you allow someone to fetch dozens, hundreds, or thousands of pages that are forbidden.
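To make the caching argument concrete, here is a rough sketch of how a spider might turn stored validators into a conditional request. The function names, header values, and cache layout are hypothetical and only meant to illustrate the point; `If-None-Match` and `If-Modified-Since` are the standard HTTP request headers that pair with `ETag` and `Last-Modified`.

```python
from typing import Dict, Optional


def conditional_headers(cached: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Build validator headers from what was stored on the previous fetch."""
    headers: Dict[str, str] = {}
    if cached:
        if cached.get("ETag"):
            headers["If-None-Match"] = cached["ETag"]
        if cached.get("Last-Modified"):
            headers["If-Modified-Since"] = cached["Last-Modified"]
    return headers


def must_download_body(status: int) -> bool:
    """A 304 Not Modified reply means the cached copy can be reused."""
    return status != 304


# Hypothetical cache entry for a static top-level page that sent validators.
static_page = {
    "ETag": '"abc123"',
    "Last-Modified": "Sun, 13 Mar 2005 20:52:00 GMT",
}

# Hypothetical cache entry for a dynamic page that sent no validators.
dynamic_page: Dict[str, str] = {}

static_request = conditional_headers(static_page)
dynamic_request = conditional_headers(dynamic_page)

print(static_request)   # validators present, so the server may answer 304
print(dynamic_request)  # nothing to validate against, so a full fetch every time
```

With no validators to send, every request for the dynamic page is a full download, so honoring the cache headers on the handful of static pages saves little if the spider also walks hundreds of excluded dynamic ones.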

But it's your tool, and you're free to adhere to the standards, or violate them, as you see fit.

All times are GMT -4.