Old 09-10-2004, 03:24 PM   #23
hacker
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
2GB of mail doesn't make a GMail killer. 10GB of mail doesn't make a GMail killer. I'd certainly trust Google with my data long before I'd trust some fly-by-night company I've never heard of. (That being said, I don't trust Google with my data at all, especially since they have already admitted to tracking people's searches per IP and storing them in their datastores.)

I haven't yet seen a GMail killer, and I doubt we'll see one in the next 2-3 years, if even that soon.

Why? Because GMail doesn't "store" email the way these other providers do. Each message in your mailbox is a pointer into an enormous datastore on their shard, and that datastore can hold messages that other users see as well.

For example, let's say you have a mailing list with 10,000 subscribers. 5,000 of those subscribers signed up with their GMail addresses. The other 5,000 are random email addresses across the globe.

When a message is sent to the list and delivered to the GMail subscribers, one copy of the message is put into the GMail datastore, and 5,000 pointers all reference it. If you are one of the 5,000 people in GMail who "get" that message, you see it in your mailbox. If 1,000 of the GMail users delete it, there are now 4,000 pointers to the original message.
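
Here's a rough sketch of that pointer idea in Python. To be clear, this is a toy illustration of single-copy storage with reference counting, not Google's actual code; the class name, the method names, and the choice of a SHA-256 digest as the "pointer" are all my own assumptions.

```python
import hashlib


class SingleInstanceStore:
    """Toy single-copy mail store (hypothetical, for illustration only)."""

    def __init__(self):
        self.bodies = {}     # digest -> message bytes, stored once
        self.refcount = {}   # digest -> number of mailbox pointers
        self.mailboxes = {}  # user -> list of digests (the "pointers")

    def deliver(self, users, message: bytes) -> str:
        # Identical content always hashes to the same key,
        # so the body is written only the first time it is seen.
        digest = hashlib.sha256(message).hexdigest()
        self.bodies.setdefault(digest, message)
        for user in users:
            self.mailboxes.setdefault(user, []).append(digest)
            self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def delete(self, user, digest):
        # Deleting removes one pointer; the body is dropped
        # only when nobody references it any more.
        self.mailboxes[user].remove(digest)
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.bodies[digest]
            del self.refcount[digest]
```

Deliver one list message to 5,000 users and `bodies` holds a single entry with a reference count of 5,000. Have 1,000 of them delete it and the count drops to 4,000 while the one stored copy stays put.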

The same goes for spam on GMail. When you report a message as spam, a heuristic identifies it so that NO OTHER GMail USERS receive that spam again, weighted by how many other users have also marked that message as spam.
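
A minimal sketch of what a shared, weighted spam report could look like, again in Python. This is my guess at the general shape, not GMail's real heuristic: the exact-match fingerprint, the per-report `weight`, and the `threshold` of 3.0 are all made-up assumptions, and a real system would need fuzzy matching to catch spam that varies from copy to copy.

```python
import hashlib


class SharedSpamFilter:
    """Toy crowd-weighted spam filter (hypothetical, for illustration only)."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold  # assumed cutoff, not a real GMail value
        self.scores = {}            # fingerprint -> accumulated report weight
        self.reporters = {}         # fingerprint -> users who already reported

    @staticmethod
    def fingerprint(message: str) -> str:
        # Normalize case and whitespace so trivial variations still match.
        normalized = " ".join(message.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, user: str, message: str, weight: float = 1.0) -> None:
        # Each user counts once per message; weight could reflect
        # how trustworthy that user's past reports have been.
        fp = self.fingerprint(message)
        seen = self.reporters.setdefault(fp, set())
        if user not in seen:
            seen.add(user)
            self.scores[fp] = self.scores.get(fp, 0.0) + weight

    def is_spam(self, message: str) -> bool:
        # Once enough weight accumulates, the message is blocked for everyone.
        return self.scores.get(self.fingerprint(message), 0.0) >= self.threshold
```

The point is that one user's report doesn't block anything by itself. The message only gets filtered for everybody once enough independent reports pile up.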

So storing 2GB of mail isn't really storing that much. You're probably storing 2GB of mail across 100 users who share some percentage of the same messages.
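
To put some back-of-the-envelope numbers on that, here's a tiny Python calculation. The model is a deliberate oversimplification and the figures are invented: it assumes every user fills their quota, and that the shared portion of each mailbox is the very same set of messages for all users, stored once.

```python
def physical_storage_gb(users: int, quota_gb: float, shared_fraction: float) -> float:
    """Disk actually needed under a simplified single-copy model (hypothetical)."""
    # Mail unique to each user has to be stored once per user.
    unique = users * quota_gb * (1 - shared_fraction)
    # Mail common to everyone is stored once, no matter how many users see it.
    shared = quota_gb * shared_fraction
    return unique + shared


logical = 100 * 2.0                             # 200 GB as the users see it
physical = physical_storage_gb(100, 2.0, 0.5)   # 101 GB on disk
```

So in this made-up case, 100 users who each "have" 2GB would need about 101GB of real disk instead of 200GB if half of their mail were common to all of them.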

Make sense? There are several reasons why this won't be seen in other systems:

  1. Google owns the technology that drives it, including the algorithms
  2. Google has an enormous number of brilliant people to develop and manage it
  3. Google also has an enormous amount of bandwidth and tens of thousands of machines powering their infrastructure

It would be interesting to see more ideas like this coming out, especially the distributed spam heuristic kind of ideas, but for now, it is most likely locked up in some very tight patents and licensing.