#91 | |
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
|
Quote:
What will happen is that search indexes like Google's will need to become even more annotated with metadata to categorize the data, so searches will become more focused. Single-word searches won't happen anymore. For example, you might put in a word like France and then a suggestion list will pop up with stuff like food, travel, sites, landmarks, history to narrow down the search. No, Google will actually be a provider of search agents which use Google's search index to mine for what you want. The search agent of course will watch what you do on your PC... It will be like the "awesome bar" in FF3 on steroids, suggesting stuff based on your documents, emails, sites you've browsed, forum posts, blogs you follow, etc. For some really interesting search results take a look at cuil.com... it is a new search engine that does some of that "search agent" type stuff. In FF3 you can't see the text box... but in IE it shows. I'll have to email them about that. BOb |
|
#92 |
Fanatic
Posts: 584
Karma: 914
Join Date: Mar 2008
Device: iliad
|
For me an agent is a (theoretical) piece of software you send away, which collects and processes information somewhere else based on its code, and then comes back with the data. Maybe we misunderstood what we were talking about.
To some very primitive degree, this is how Gnutella searches work. You send away your query string with your destination IP, and it spreads itself autonomously over the Gnutella network, and every node that has a hit, or knows of a hit, sends the answer to your IP. However, it's a very primitive searching technique which a) only searches contents from the title line, and b) takes eons to get a good answer list. I mean, just imagine the code of Google: they have the whole internet indexed (apart from the invisible deep net), and you enter any query and get an answer in milliseconds... just imagine what great development there is behind this... If somebody had asked me 20 years ago whether such a thing is technically possible, I'd have said no... And I did see the pre-AltaVista days, when WebCrawler results came dribbling in piece by piece.... unsorted of course... Last edited by axel77; 09-04-2008 at 02:10 PM. |
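A rough sketch of the flooding search described above, for illustration only. It is not the real Gnutella wire protocol: the `network` and `titles` dictionaries, the node names, and the `ttl` hop limit are all assumptions made up for this example. Each node matches the query against its own title lines, reports hits back to the originator, and forwards the query to neighbours that have not seen it yet.

```python
from collections import deque

def flood_query(network, titles, origin, query, ttl=4):
    """Flood `query` outward from `origin`; return (node, title) hits.

    network: hypothetical map of node -> list of neighbour nodes
    titles:  hypothetical map of node -> list of shared title lines
    ttl:     assumed hop limit that keeps the query from spreading forever
    """
    hits = []
    seen = {origin}                  # nodes that already received the query
    queue = deque([(origin, ttl)])
    needle = query.lower()
    while queue:
        node, hops_left = queue.popleft()
        # Only the title line is searched, as the post points out.
        for title in titles.get(node, []):
            if needle in title.lower():
                hits.append((node, title))   # "sends the answer to your IP"
        if hops_left == 0:
            continue
        for neighbour in network.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return hits

# Made-up five-node network for the example.
network = {
    "me": ["a", "b"],
    "a": ["me", "c"],
    "b": ["me", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
titles = {
    "a": ["French cooking basics"],
    "c": ["Travel in France"],
    "d": ["France history notes"],
}

near_hits = flood_query(network, titles, "me", "france", ttl=2)
far_hits = flood_query(network, titles, "me", "france", ttl=3)
```

With a hop limit of 2 the query never reaches node `d`, which hints at why such searches return incomplete answer lists unless they are allowed to spread widely (and slowly).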
#93 |
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
|
|
#94 | |
Liseuse Lover
Posts: 869
Karma: 1035404
Join Date: Jul 2008
Location: Netherlands
Device: PRS-505
|
Quote:
Of course, you could work without the context of one or two lines and let people guess from the URL whether a given search result is good or not, but somehow I don't think that will go over well. So you are correct: without the cache you will not get the context, just a keyword and a link. Which is a horrible way to search the web. Last edited by acidzebra; 09-04-2008 at 02:30 PM. Reason: I no speel good. |
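To make the point about context concrete, here is a minimal sketch of how a result excerpt might be cut from stored page text. The function name `make_snippet`, the `radius` parameter, and the sample sentence are assumptions for illustration; this is not how any particular search engine actually builds its excerpts.

```python
def make_snippet(stored_text, keyword, radius=40):
    """Return the text surrounding the first keyword match.

    Returns None when no page text was retained or the keyword is absent,
    which leaves only a keyword and a link to show in the result list.
    """
    if not stored_text:
        return None
    pos = stored_text.lower().find(keyword.lower())
    if pos == -1:
        return None
    start = max(0, pos - radius)
    end = min(len(stored_text), pos + len(keyword) + radius)
    return stored_text[start:end].strip()

with_text = make_snippet("Cheap travel to France in spring", "france", radius=10)
without_text = make_snippet(None, "france")
```

When the page text is kept, the excerpt can be shown next to the link; when nothing is kept, the function has nothing to cut from.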
|
#95 | |
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
|
Quote:
Google Guide (http://www.googleguide.com/cached_pages.html) talks about the fact that Google will remove cached pages if requested by the site owner. I assume this is done with the no-cache directive in the header. However, removing a page from the cache doesn't remove it from the index. So they do not "need" to retain the cache once the page has been indexed to provide search results. BOb |
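The difference between "don't cache" and "don't index" can be sketched as two separate switches. As far as I know, the switch crawlers usually look at for the cached copy is the `noarchive` value of the robots meta tag rather than an HTTP `no-cache` header, but treat that as an assumption here; the `crawl_policy` helper below is made up for illustration and is not any search engine's actual logic.

```python
def crawl_policy(robots_directives):
    """Interpret a robots directive string such as "noarchive, nofollow".

    Returns whether a page may appear in the index and whether a cached
    copy may be offered. Hypothetical helper for illustration only.
    """
    tokens = {t.strip().lower() for t in robots_directives.split(",") if t.strip()}
    # "none" is commonly treated as shorthand for "noindex, nofollow".
    may_index = not ("noindex" in tokens or "none" in tokens)
    # A page that is not indexed has no cached copy to show either.
    may_cache = may_index and "noarchive" not in tokens
    return {"index": may_index, "cache": may_cache}

default_policy = crawl_policy("")
no_cache_policy = crawl_policy("noarchive")
no_index_policy = crawl_policy("noindex")
```

The middle case is the one discussed above: the page stays findable in the index while no cached copy is offered.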
|
#96 | |
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
|
Quote:
Also, as acidzebra pointed out, not retaining any cache means that you completely lose the excerpt/context, and search results just become a list of URLs with no way of knowing any more details without going to every link. Yes, as you pointed out, there are already ways to get Google to not retain a cache as well as to not index your site at all. Taylor doesn't seem to be happy with those solutions though. Last edited by Shaggy; 09-04-2008 at 02:38 PM. |
|
#97 | |||
fruminous edugeek
Posts: 6,745
Karma: 551260
Join Date: Oct 2006
Location: Northeast US
Device: iPad, eBw 1150
|
I really don't want to have to play with yet another browser, but I suppose if Google Chrome gains any appreciable market share, I'll have to install it so I can test the website I am responsible for. :sigh:
The Google ethical question is interesting. I've heard Taylor's concerns before, but not so clearly stated as they ended up being in this thread. Quote:
Let's suppose that one is responsible for providing critical information, e.g. current status on an epidemic. One would need to be able to control whether "stale" versions of that content are displayed. Google cache could display such stale content. However, is it reasonable to require content publishers to "opt in" to allowing this data to be cached for viewing, when again, this is the model upon which the web was built? Perhaps the "no-cache" directive should have more strength than a "gentleman's agreement." There might be problems enforcing this internationally, however. It would probably need to be appended to something like the Berne Convention. Regarding the second point: is it reasonable that Google assumes that content publishers want their sites to be listed within a Google search result, regardless of the fact that Google will place advertisements on that page? Is it legal? Is it an ethical use of publicly available, but privately owned data? I'm not an expert on the law, by any means. But I can readily come up with examples of cases in the physical world in which directory listings of publicly available information are provided either for a fee or via advertising-supported means. For example, most telephone companies in the US provide listings of residential and business phone numbers, and advertising is often sold in the same printed volumes. (In fact, in recent years, the phone company I deal with has taken to charging a fee not to list this data.) Names and addresses are collected by numerous agencies and sold to direct marketers. I don't like this process, but it is apparently legal. Photographs of scenic villages are used to advertise venues located in those villages. I don't believe the owners of the houses and gardens that are included in these photographs even need to be asked if they want their property included in such a photograph, so long as the view is from a public location, such as the street.
The distinction, again, seems to be where to draw the line between "public" and "private." Quote:
Quote:
It also assumes that users-- owners of private computers-- will be willing to devote their own resources to being part of a search mechanism, rather than putting up with ads (and the bias that comes with them). Based on my observations of seeder vs. leecher ratios on BitTorrent networks, I'm not optimistic. But let's assume that some non-commercial search entity exists and is effective. One would need to be able to determine whether to allow searching of one's content by commercial or non-commercial means, and the present robots.txt and nocache directives would not be sufficient, because they would exclude searches by such non-commercial services. The question might then become: do we need a "non-commercial" directive, or a "commercial-ok" directive? I.e., should allowing commercial search engines to index and/or cache one's site be opt-in, or opt-out? I don't know the answer to this. I suspect that the vast majority of people who create web content would prefer to be included in both commercial and non-commercial search engines, even if both were viable. So from a usability standpoint, opt-out would make sense. From a legal and ethical standpoint, however, I can see preferring opt-in. Either way, again we would need something stronger than a "gentleman's agreement," which would probably require an international convention. Well, those are my current thoughts on this complex issue.... |
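For reference, here is a small sketch of what robots.txt can and cannot express today, using Python's standard `urllib.robotparser`. The crawler names `CommunityBot` and `AdSearchBot` and the example.org URL are hypothetical. The file can only single out crawlers by the name they announce; there is no standard "commercial" or "non-commercial" category, and nothing forces a crawler to run this check at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: welcome one named crawler, turn everyone else away.
ROBOTS_TXT = """\
User-agent: CommunityBot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(user_agent, url="http://example.org/status.html"):
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)

community_allowed = may_crawl("CommunityBot")
commercial_allowed = may_crawl("AdSearchBot")
```

So an approximation of opt-in is possible by listing every acceptable crawler by name, but it stays a voluntary convention: a well-behaved crawler performs this check before fetching, and a badly behaved one simply skips it.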
|||
#98 | ||
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
|
Quote:
Quote:
|
||
#99 |
Actively passive.
Posts: 2,042
Karma: 478376
Join Date: Feb 2008
Location: US
Device: Sony PRS-505/LC
|
Some specific lawsuits I know of: New York Times, which provides certain content to subscribers only. Google became a subscriber, took the content, and cached it. Non-subscribers could then view the content directly from Google's cache. To suggest that this was ethical because the New York Times didn't have a no-cache entry etc. is ludicrous, and Google lost that one. Another was Perfect 10, a "men's magazine", which had photos from their magazine on their site, presumably to entice people to join the site and/or subscribe to the magazine. Google cached the images and made them available via images.google.com, circumventing the publisher's intent. Google lost that one, too.
The problem with robots.txt and no-cache is that there is no penalty if a search engine decides not to honor them. One shouldn't have to say "don't use this, it's mine" because copyright already covers that. The court decision in Nevada that the cache constitutes fair use is wrong, and I'm sure we'll see more specific cases in the future. Google has taken this rather narrow ruling to mean that they can now cache ANYTHING, and started scanning copyrighted books from libraries, which caused several more lawsuits. That's the Google attitude: we will scan, search, index and cache whatever we want, however we want, and you have to sue us if you don't like it. Last edited by Taylor514ce; 09-04-2008 at 03:58 PM. |
#100 | |||
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
|
Quote:
However, the funny thing is that the judge didn't impose a penalty on Google. Rather, he said that Google should take content down if the original website requests that they do so. Which was actually Google's policy all along. Quote:
Quote:
|
|||
#101 | |
Actively passive.
Posts: 2,042
Karma: 478376
Join Date: Feb 2008
Location: US
Device: Sony PRS-505/LC
|
Quote:
"If you want to retain your copyrights, then don't post anything on an open website." I disagree with that in principle, and so do these specific cases. I see a major philosophical difference between "we'll take it down if you ask us to", by which time the damage is done, and "you need my permission before you can have it in the first place", which is how copyright in fact works. Last edited by Taylor514ce; 09-04-2008 at 04:34 PM. |
|
#102 | |||
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
|
Quote:
Quote:
Quote:
|
|||
#103 |
Resident Curmudgeon
Posts: 79,771
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
[OT]Amalthia, is your avatar from The Last Unicorn? I really like that movie.[/OT]
|
#104 |
Wizard
Posts: 1,185
Karma: 32196
Join Date: Jan 2007
Location: Anchorage, AK
Device: Sony Reader PRS-505, PRS-650, PRS-T3, Pocketbook HD2
|
|
#105 |
MR Drone
Posts: 1,613
Karma: 15612282
Join Date: Oct 2007
Location: DRONEZONE
Device: PB360+, Huawei MP5, Libra H20
|
Google tweaks Chrome licence text
|
|
||||
Thread | Thread Starter | Forum | Replies | Last Post |
HTC Google Chrome OS tablet - more info | SameOldStory | News | 5 | 08-23-2010 07:05 PM |
Sony uses Chrome as default browser | pking36330 | News | 42 | 09-04-2009 03:54 PM |
Wierd warning from google chrome!! | mklynds | Feedback | 3 | 06-13-2009 02:30 AM |
Google Planning Web Browser? | Liqiud | Lounge | 0 | 01-27-2005 07:39 PM |