MobileRead Forums - View Single Post - Is the e-book threatening the future of our literary culture?

DMcCunney · 09-05-2007, 02:10 PM

Quote:

Originally Posted by nekokami

Ah. I was assuming you'd grab a copy of the file. After all, when you look at a website, you're effectively grabbing a copy of the HTML file (or whatever is generated by the script that creates the page, if we're not talking about static pages).

Yes, you are, but the HTML file is not a compressed archive that must be opened and examined. And whether you can get to it at all depends upon the site. Does it require a login/password?

Even if it doesn't, you may not be able to grab the file in a neatly automated manner. Sites use a file called robots.txt to tell web spiders which parts of the site they may crawl and which they should leave alone. Spiders that ignore robots.txt may just get their originating IP address blocked by the site they spider.
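For anyone curious what honoring that file looks like in practice, here is a rough sketch using Python's standard urllib.robotparser module. The rules, the "ExampleSpider" user agent, and the example.com URLs are all made up for illustration; a real spider would download the file from the site's root instead of using a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
# A real spider would fetch this from the site's root directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text into a rule checker without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return bool(parser.can_fetch(user_agent, url))


parser = build_parser(ROBOTS_TXT)

# "ExampleSpider" and the URLs below are placeholders, not real resources.
for url in (
    "https://example.com/books/index.html",
    "https://example.com/private/book.zip",
):
    verdict = "allowed" if may_fetch(parser, "ExampleSpider", url) else "disallowed"
    print(url, "->", verdict)
```

Note that nothing forces a spider to run a check like this. The file is only a request, which is why a site's real enforcement tends to be blocking the offender's IP address.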
______
Dennis