Old 10-27-2006, 06:47 AM   #54
bowerbird
Banned
Posts: 269
Karma: -273
Join Date: Sep 2006
Location: los angeles
collaborative-filtering separates wheat from chaff

radleyp said:
> All kinds of garbage is submitted to publishers today,
> and they save us the trouble of wading through it
> by refusing to publish.

oh good lord.

today, publishing companies are the ones force-feeding us
the garbage of celebrity bios and recent-bestseller clones,
in a feverish search for the blockbuster that's gonna hit big
and make the bottom-line acceptable for this quarter, while
the rest of the front-list tanks so they can use the big losses
to write off the excess profit from the blockbuster to pad the
salaries of lawyers and accountants who keep them out of jail.

whew! i'm glad i got _that_ out of my system... ;+)

in the past, when publishing houses were run by book-lovers,
yes sir, quality-control was an important role that they filled,
and (as i said up above) they did a good job. when publishing
any title required a huge up-front investment fraught with risk,
those houses were certainly entitled to keep much of the profits.

but that world has now been turned entirely upside-down...

so let's retire those publishers to their yachts in the bahamas.

today's authors can make an e-book available to their readers
-- instantly, worldwide -- with practically zero up-front cost...
so there is no need for deep pockets, and very little "risk" at all.

even more important, the shelfspace in cyberspace is unlimited.
this means there is no requirement to "move the old books out",
which was an important dynamic in the physical-product world,
where production, storage, and distribution were all expensive.

now, production is ~free, storage is ~free, distribution is ~free.
given all that, it's hard to see how we can _fail_ to make money.

the only problem that still remains is that, since the size of the
haystack will steadily increase, it might be hard to find needles.

but here again, many-to-many communication saves the day...

how? with collaborative-filtering, that's how.

maybe you've seen it being used over on amazon.com, with their
"people who bought this book also bought this other book" thing.

yeah, well forget that ass-hat implementation, which is the kind of
"collaborative-filtering" that only a capitalist businessman can love.
a money-grubber like that is only interested in _making_the_sale_,
so of course the variable measured will be the purchase behavior.

but i suppose you've _bought_ books you didn't end up enjoying, eh?
sure you have. the important question is not "did you buy this book?"
the answer we _need_ is "after reading it, did you _enjoy_ this book?"

based on your answer to _that_ question (over a hundred or so books),
a collaborative-filtering system with data from a few thousand people
could deliver books that you would _love_ -- day after day after day --
to your e-mailbox, enough to bury you in books for the rest of your life.

it's just a statistical procedure that compares your ratings profile to
those of other people; and when it finds similarities, then it looks for
highly-rated items from them that you haven't yet seen (i.e., rated).

(likewise, it is able to warn you off any material they have rated low.
in the long run, this might prove to be more worthwhile, since it will
save you money on purchases that you'd later come to regret.)
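
here's a minimal sketch of that kind of procedure in python. it is one common approach (user-to-user pearson correlation with a similarity-weighted average), not a description of any particular site's system. the ratings, the reader names, and the 8.0 / 3.0 cutoffs are all made up for illustration.

```python
from math import sqrt

# hypothetical enjoyment ratings on a 1-10 scale: reader -> {book: rating}
RATINGS = {
    "you": {"A": 9, "B": 2, "C": 8, "D": 3},
    "ann": {"A": 10, "B": 1, "C": 9, "D": 2, "E": 10, "F": 2},
    "bob": {"A": 2, "B": 9, "C": 3, "D": 8, "E": 1, "F": 9},
    "cat": {"A": 8, "B": 3, "C": 7, "E": 9},
}


def similarity(a, b):
    """pearson correlation over the books both readers rated (0.0 if undefined)."""
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[k] for k in shared) / len(shared)
    mean_b = sum(b[k] for k in shared) / len(shared)
    cov = sum((a[k] - mean_a) * (b[k] - mean_b) for k in shared)
    var_a = sum((a[k] - mean_a) ** 2 for k in shared)
    var_b = sum((b[k] - mean_b) ** 2 for k in shared)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(ratings, reader, book):
    """similarity-weighted average rating from readers whose taste matches."""
    mine = ratings[reader]
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == reader or book not in theirs:
            continue
        w = similarity(mine, theirs)
        if w > 0:  # only listen to people with similar profiles
            num += w * theirs[book]
            den += w
    return num / den if den else None


def recommend_and_warn(ratings, reader, high=8.0, low=3.0):
    """split unseen books into likely favorites and likely regrets."""
    seen = set(ratings[reader])
    unseen = {b for r in ratings.values() for b in r} - seen
    recs, warns = [], []
    for book in sorted(unseen):
        p = predict(ratings, reader, book)
        if p is None:
            continue
        if p >= high:
            recs.append(book)
        elif p <= low:
            warns.append(book)
    return recs, warns


recs, warns = recommend_and_warn(RATINGS, "you")
print(recs, warns)  # ['E'] ['F']
```

in this toy data, "ann" and "cat" rate like "you", so their love of book E becomes a recommendation and ann's low score for F becomes a warning; "bob" has opposite taste, so his ratings are ignored.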

it doesn't take a very big "sample size" for most statistics to work;
national polls typically have only a thousand or so respondents.
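
to put rough numbers on the polling comparison, here's a small sketch using the textbook 95% margin-of-error formula for a proportion. it assumes a simple random sample and the worst case of p = 0.5, and it describes polls, not rating prediction, so treat it as an analogy only.

```python
from math import sqrt


def margin_of_error(n, p=0.5, z=1.96):
    """approximate 95% margin of error for a proportion from n respondents."""
    return z * sqrt(p * (1 - p) / n)


for n in (300, 1000, 3000):
    # prints the margin in percentage points: 5.7, 3.1, 1.8
    print(n, round(100 * margin_of_error(n), 1))
```

the margin shrinks with the square root of the sample size, which is why a modest sample already gets you most of the way there.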

my guesstimate would be that a well-done system could predict your
rating for a specific book within two-tenths of a point on a 1-10 scale.
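
one way a claim like that could be checked is to hide some ratings, predict them, and measure the average miss. a minimal sketch, with made-up numbers and the 0.2 target taken from the guesstimate above:

```python
def mean_absolute_error(actual, predicted):
    """average size of the miss between real and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# hypothetical held-out ratings and the system's guesses for them
held_out = [9, 3, 7, 10]
guesses = [8.9, 3.2, 7.1, 9.9]

mae = mean_absolute_error(held_out, guesses)
within_target = mae <= 0.2  # the two-tenths-of-a-point guesstimate
print(round(mae, 3), within_target)
```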

next to such precision, old-fashioned "word of mouth from friends"
will seem positively _primitive_. your current friends
are likely to have encountered much the same content as you, while
"strangers" who live in other corners of the globe are likely to have
experienced a wide spectrum that's completely different from yours;
the "secret sauce" is material that you wouldn't have found otherwise.

further, if the system contained ratings from a few _million_ people --
perhaps like the 20 million who visited youtube in the past 3 months --
you'd have enough reading material to last you _dozens_ of lifetimes,
all of 'em books that you'd happily rate as 9.8 or higher on a 1-10 scale,
books that you will remember for the rest of your life as your "favorites".

and one of 'em might be a book that only 241 people on the whole planet
would love as much as you. everybody else might totally _hate_ the book.
but what do you care? you don't have to interact with those people, right?

and moreover, since our collaborative-filtering system would _know_ that
they'd hate that book, it wouldn't even bother to inform them about it, so
they won't have to "wade through it", so they won't object in the slightest.

this is why any "measures of quality" will come to be seen as superfluous.

by any "objective" scale, a book this unpopular has to be a "bad" book.
but since the 6 billion people who'd detest it don't have to mess with it,
what'll it hurt to keep it around so its 241 fans can enjoy it to the fullest?

likewise songs. likewise videos. likewise photos. likewise digital art.

no "wading through low-quality garbage" to find stuff you really love;
collaborative-filtering will deliver tons of it right to your screen, daily.

but hey, the fun doesn't stop there! no sir, we're just getting _started_...

because let's say i'm the author of that book that only a mere 241 people
in the whole world rated as a "10". on the one hand, you might think that
i must be a terrible author, to write a tract garnering so few admirers,
spread far-and-wide all around the 7 continents. (1 fan from antarctica!)

but from my perspective, what the collaborative-filtering system has done
is _remarkable_, as it has introduced me to 241 people who love my work!
241 people spanning all 7 continents. (did i mention antarctica?)
how would i have ever found those people myself? it would be impossible!

and they loved it so much, they took advantage of global communications
and contacted me, and each other, and now we are one big happy family!
we have a listserv where we constantly have loads of fun with each other.
indeed, one of my jewish fans fell in love with one of my palestinian fans,
and they're getting married in february! over half of us (127) are going!
(including the guy from antarctica! we look forward to the f-t-f with him!)

meanwhile, all of this love has inspired me so much as a writer that i have
already finished another book, and am halfway done on the one after that!

yeah, i'm just making all this up, but do you see where i'm going with this?

cory doctorow put it this way: "content isn't king, communication is king;
the main reason we absorb content is so we can talk about it with friends."

in the beginning, we'll think of this collaborative-filtering system as
a useful tool that delivers _high-quality_content_ to our doors, and
yes, we'll continue to appreciate that aspect of it for a long time, but
very soon as well, we'll realize that it is actually much more valuable
as a mechanism bringing _high-quality_friendships_ into our lives...

and that, my friends, is quite a double-barreled shot into the future.

-bowerbird