How Computer Recommendations Can Dull Our Searching Souls

Lev Grossman wrote an excellent article in TIME on how recommendation engines work (e.g., for Netflix movie selection and Pandora radio selection) and how they can start turning us into boringly homogeneous and predictable blockbuster consumers of the same stuff within one safe space.

Alas, when it comes to movie choices, the options and parameters are so numerous that the suggestions I get are often unreliable.

TIME Magazine

http://www.time.com/time/magazine/article/0,9171,1992403,00.html 

Thursday, May 27, 2010

How Computers Know What We Want — Before We Do

By Lev Grossman

Here’s an experiment: try thinking of a song not as a song but as a collection of distinct musical attributes. Maybe the song has political lyrics. That would be an attribute. Maybe it has a police siren in it, or a prominent banjo part, or paired vocal harmony, or punk roots. Any one of those would be an attribute. A song can have as many as 400 attributes — those are just a few of the ones filed under p.

This curious idea originated with Tim Westergren, one of the founders of an Internet radio service based in Oakland, Calif., called Pandora. Every time a new song comes out, someone on Pandora’s staff — a specially trained musician or musicologist — goes through a list of possible attributes and assigns the song a numerical rating for each one. Analyzing a song takes about 20 minutes.

The people at Pandora — no relation to the alien planet — analyze 10,000 songs a month. They’ve been doing it for 10 years now, and so far they’ve amassed a database containing detailed profiles of 740,000 different songs. Westergren calls this database the Music Genome Project.

There is a point to all this, apart from settling bar bets about which song has the most prominent banjo part ever. The purpose of the Music Genome Project is to make predictions about what kind of music you’re going to like next. Pandora uses the Music Genome Project to power what’s known in the business as a recommendation engine: one of those pieces of software that gives you advice about what you might enjoy listening to or watching or reading next, based on what you just listened to or watched or read. Tell Pandora you like Spoon and it’ll play you Modest Mouse. Tell it you like Cajun accordion virtuoso Alphonse "Bois Sec" Ardoin and it’ll try you out on some Iry LeJeune. Enough people like telling Pandora what they like that the service adds 2.5 million new users a month.
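The attribute-based approach described above can be sketched in a few lines of Python. This is only an illustration, not Pandora's actual method: the song names, the four attributes, the 0 to 5 rating scale and the use of cosine similarity are all assumptions made for the example.

```python
import math

# Hypothetical attribute ratings on a 0-5 scale. The song names, attribute
# names and numbers are invented for illustration.
SONGS = {
    "song_a": {"banjo": 4.5, "political_lyrics": 0.5, "vocal_harmony": 3.0, "punk_roots": 0.0},
    "song_b": {"banjo": 4.0, "political_lyrics": 1.0, "vocal_harmony": 3.5, "punk_roots": 0.5},
    "song_c": {"banjo": 0.0, "political_lyrics": 4.5, "vocal_harmony": 0.5, "punk_roots": 5.0},
}


def cosine_similarity(a, b):
    """Cosine of the angle between two attribute profiles (1.0 = same direction)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(seed, songs):
    """Return the name of the song whose profile is closest to the seed song."""
    candidates = [
        (cosine_similarity(songs[seed], attrs), name)
        for name, attrs in songs.items()
        if name != seed
    ]
    return max(candidates)[1]


print(most_similar("song_a", SONGS))
```

With these made-up numbers, the two banjo-heavy songs end up paired, while the punk-rooted one scores far lower.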

Over the past decade, recommendation engines have become quietly ubiquitous. At the appropriate moment — generally when you’re about to consummate a retail purchase — they appear at your shoulder, whispering suggestively in your ear. Amazon was the pioneer of automated recommendations, but Netflix, Apple, YouTube and TiVo have them too. In the music space alone, Pandora has dozens of competitors. A good recommendation engine is worth a lot of money. According to a report by industry analyst Forrester, one-third of customers who notice recommendations on an e-commerce site wind up buying something based on them.

The trouble with recommendation engines is that they’re really hard to build. They look simple on the outside — if you liked X, you’ll love Y! — but they’re actually doing something fiendishly complex. They’re processing astounding quantities of data and doing so with seriously high-level math. That’s because they’re attempting to second-guess a mysterious, perverse and profoundly human form of behavior: the personal response to a work of art. They’re trying to reverse-engineer the soul.

They’re also changing the way our culture works. We used to learn about new works of art from friends and critics and video-store clerks — from people, in other words. Now we learn about them from software. There’s a new class of tastemakers, and they’re not human.

Learning to Love Dolph Lundgren
Pandora makes recommendations the same way people do, more or less: by knowing something about the music it’s recommending and something about your musical taste. But that’s actually pretty unusual. It’s a very labor-intensive approach. Most recommendation engines work backward instead, using information that comes not from the art but from its audience.

It’s a technique called collaborative filtering, and it works on the principle that the behavior of a lot of people can be used to make educated guesses about the behavior of a single individual. Here’s the idea: if, statistically speaking, most people who liked the first Sex and the City movie also like Mamma Mia!, then if we know that a particular individual liked Sex and the City, we can make an educated guess that that individual will also like Mamma Mia!
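A toy version of that educated guess, assuming a tiny invented table of who liked what: among the people who liked one title, count the share who also liked another, and recommend the title with the highest share. The user IDs and their likes below are hypothetical, and real systems use far more data and more elaborate statistics.

```python
# Hypothetical "liked" sets per user. The titles come from the article;
# the users and their likes are invented for illustration.
LIKES = {
    "u1": {"Sex and the City", "Mamma Mia!"},
    "u2": {"Sex and the City", "Mamma Mia!", "Saw V"},
    "u3": {"Sex and the City", "Mamma Mia!"},
    "u4": {"Sex and the City", "Saw V"},
    "u5": {"Saw V"},
}


def share_who_also_liked(likes, seen, candidate):
    """Fraction of the people who liked `seen` that also liked `candidate`."""
    fans = [titles for titles in likes.values() if seen in titles]
    if not fans:
        return 0.0
    return sum(candidate in titles for titles in fans) / len(fans)


def recommend(likes, seen):
    """Pick the other title most often liked by fans of `seen`."""
    titles = set().union(*likes.values()) - {seen}
    return max(sorted(titles), key=lambda t: share_who_also_liked(likes, seen, t))


print(share_who_also_liked(LIKES, "Sex and the City", "Mamma Mia!"))
print(recommend(LIKES, "Sex and the City"))
```

Note that nothing in the code knows what either movie is about; it only counts audience overlap.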

It sounds simple enough, but the closer you look, the weirder and more complicated it gets. Take Netflix’s recommendation engine, which it has dubbed Cinematch. The algorithmic guts of a recommendation engine are usually a fiercely guarded trade secret, but in 2006 Netflix decided it wasn’t completely happy with Cinematch, and it took an unusual approach to solving the problem. The company made public a portion of its database of movie ratings — around 100 million of them — and offered a prize of $1 million to anybody who could improve its engine by 10%.

The Netflix competition opened a window onto a world that’s usually locked away deep in the bowels of corporate R&D departments. The eventual winner — which clinched the prize last fall — was a seven-man, four-country consortium called BellKor’s Pragmatic Chaos, which included Bob Bell and Chris Volinsky, two members of AT&T’s research division. Talking to them, you start to see how difficult it is to make a piece of software understand the vagaries of human taste. You also see how, oddly, software understands things about our taste in movies that a human video clerk never could.

The key point to grasp about collaborative-filtering software is that it knows absolutely nothing about movies. It has no preconceptions; it works entirely on the basis of the audience’s reaction. So if a large enough group of people claim to have enjoyed, say, both Saw V and On Golden Pond, the software would be forced to infer that those two movies share some common quality that the viewers enjoyed. Crazy? Or crazy genius?

In such a case, the software would have discovered an aesthetic property that we might not even be aware of or have a name for but which in a mathematical sense must be said to exist. Even Bell and Volinsky don’t always know what the properties are. "We might be able to describe them, or we might not be able to," Bell says. "They might be subtleties like ‘action movies that don’t have a lot of blood, don’t have a lot of profanity but have a strong female lead.’ Things like that, which you would never think to categorize on your own." As Volinsky puts it, "A lot of times, we don’t come up with explanations that are explainable."
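One widely used way to find such unnamed properties is matrix factorization: give every viewer and every movie a short list of numbers (latent factors) and adjust them until their products approximate the known ratings. The sketch below is a generic textbook version trained with stochastic gradient descent, not the prize-winning system, whose details the article does not describe. The ratings, the factor count and the learning settings are all assumptions.

```python
import math
import random

# Hypothetical (user, movie, stars) triples, invented for illustration:
# users 0-1 favor movies 0-1, users 2-3 favor movies 2-3.
RATINGS = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 4.0), (1, 1, 5.0), (1, 3, 1.0),
    (2, 2, 5.0), (2, 3, 4.0), (2, 0, 1.0),
    (3, 2, 4.0), (3, 3, 5.0), (3, 1, 1.0),
]


def factorize(ratings, n_users, n_items, n_factors=2, steps=2000,
              lr=0.01, reg=0.02, seed=0):
    """Learn unnamed per-user and per-movie factors from ratings alone."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            pred = sum(p * q for p, q in zip(users[u], items[i]))
            err = r - pred
            for f in range(n_factors):
                p, q = users[u][f], items[i][f]
                # Nudge both factors to shrink the error, with light regularization.
                users[u][f] += lr * (err * q - reg * p)
                items[i][f] += lr * (err * p - reg * q)
    return users, items


def rmse(ratings, users, items):
    """Root-mean-square error of predicted versus actual ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(p * q for p, q in zip(users[u], items[i]))
        total += (r - pred) ** 2
    return math.sqrt(total / len(ratings))


untrained = factorize(RATINGS, n_users=4, n_items=4, steps=0)
trained = factorize(RATINGS, n_users=4, n_items=4)
print(rmse(RATINGS, *untrained), rmse(RATINGS, *trained))
```

The learned numbers carry no labels. Whether a given factor means "strong female lead" or something nobody has a name for is left to whoever inspects it afterward, which matches Bell's remark.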

That makes recommendation engines sound practically psychic, but everyday experience tells us that they’re actually pretty fallible. Everybody has felt the outrage that comes when a recommendation engine accuses one of a secret desire to watch Rocky IV, the one with Dolph Lundgren in it. In 2006, Walmart was charged with racism when its recommendation engine paired Planet of the Apes with a documentary about Martin Luther King. But generally speaking, the weak link in a recommendation engine isn’t the software; it’s us. Collaborative filtering works only as well as the data it has available, and humans produce noisy, low-quality data.

The problem is consistency: we’re just not good at expressing our desires in rating form. We rate things differently after a bad day at work than we would if we were on vacation. Some people are naturally stingy with their stars; others are generous. We rate movies differently depending on whether we rate them right after watching them or if we wait a week, and differently again depending on whether we saw a lousy movie or a good movie in that intervening week. We even rate differently depending on whether we rate a whole batch of movies together or one at a time.
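The stingy-versus-generous problem, at least, has a simple and common remedy: subtract each person's own average from their ratings, so that what remains is how much they liked a movie relative to their usual score. The sketch below assumes two invented raters, and it addresses only this one source of noise, not mood or timing.

```python
from statistics import mean

# Hypothetical star ratings; the two raters and their scores are invented.
STARS = {
    "stingy": {"Rocky IV": 1, "On Golden Pond": 3, "Saw V": 2},
    "generous": {"Rocky IV": 3, "On Golden Pond": 5, "Saw V": 4},
}


def center_by_user(ratings):
    """Express each rating as its distance from that user's own average."""
    centered = {}
    for user, stars in ratings.items():
        baseline = mean(stars.values())
        centered[user] = {title: s - baseline for title, s in stars.items()}
    return centered


print(center_by_user(STARS))
```

After centering, the two raters turn out to have identical relative preferences even though none of their raw star counts match.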

All this means that there’s a ceiling to how accurate collaborative filtering can get. "There’s a lot of randomness involved," Volinsky admits. "There’s some intrinsic level of error associated with trying to predict human behavior."

The Great Choice Epidemic
Recommendation engines are a response to the strange new world of online retail. It’s a world characterized by a surplus of something we usually can’t get enough of: choice.

We’re drowning in it. As Sheena Iyengar points out in her book The Art of Choosing, in 1994 there were 500,000 different consumer goods for sale in the U.S. Now Amazon alone offers 24 million. When faced with such an oversupply of choice, our little lizard brains go straight to vapor lock. "We think the profusion of possibilities must make it that much easier to find that perfect gift for a friend’s birthday," Iyengar writes, "only to find ourselves paralyzed in the face of row upon row of potential presents." We’re living through an epidemic of choice. We require an informational prosthesis to navigate it. The recommendation engine is that prosthesis: it winnows the millions of options down to a manageable handful.

But there’s a trade-off involved. Recommendation engines introduce a new voice into the cultural conversation, one that speaks to us when we’re at our most vulnerable, which is to say at the point of purchase. What is that voice saying? Recommendation engines aren’t designed to give us what we want. They’re designed to give us what they think we want, based on what we and other people like us have wanted in the past.

Which means they don’t surprise us. They don’t take us out of our comfort zone. A recommendation engine isn’t the spouse who drags you to an art film you wouldn’t have been caught dead at but then unexpectedly love. It won’t force you to read the 18th century canon. It’s no substitute for stumbling onto a great CD just because it has cool cover art. Recommendation engines are the enemy of serendipity and Great Books and the avant-garde. A 19th century recommendation engine would never have said, If you liked Monet, you’ll love Van Gogh! Impressionism would have lasted forever.

The risk you run with recommendation engines is that they’ll keep you in a rut. They do that because ruts are comfy places — though often they’re deeper than they look. "By definition, we keep you in the same musical neighborhood you start in," says Westergren of the Music Genome Project, "so you could say that’s limiting. But even within a neighborhood, there is a ton of room for discovery. Forty-five percent of the people who use Pandora buy more music after they start, and only 1% buy less." And not being based solely on data from its audience, Pandora isn’t as vulnerable to peer pressure as most recommendation engines are. It doesn’t follow the crowd.

Pandora is unusual, though. The general effect of recommendation engines on shopping behavior is a hot topic among econometricians, if that’s not an oxymoron, but the consensus is this: they introduce us to new things, which is good, but those new things tend to be a lot like the old things, and they tend to be drawn from the shallow pool of things other people have already liked. As a result, they create a blockbuster culture in which the same few runaway hits get recommended over and over again. It’s the backlash against the "long tail," the idea that shopping online is all about near infinite selection and cultural diversity. It has a bad habit of eating its own tail and leaving you back where you started.

But this isn’t just about retail. The Web has transformed how we shop. Now it’s transforming our social lives too, and recommendation engines are coming along for the ride. Just as Netflix reverse-engineers our response to art, dating sites like Match.com and eHarmony and OKCupid use algorithms to make predictions about that equally ineffable human phenomenon, love; or, failing that, lust. The idea is the same: they break down human behavior into data, then look for patterns in the data that they can use to pair up the humans.

Even if you’re not into online dating, you’re probably on Facebook, currently the second most visited site on the Web. Facebook gives users the option of switching between a straight feed, which shows all their friends’ news in chronological order, and an algorithmically curated selection of the updates Facebook’s recommendation engine thinks they’d most like to see. And in the right-hand column, Facebook uses a different set of algorithms to recommend new friends. If you loved Jason, why not try Jordan?!

And as for the first most trafficked site on the Web, if you cock your head only slightly to one side, Google is, effectively, a massive recommendation engine, advising us on what we should read and watch and ultimately know. It used to return the same generic results to everyone, but in December it put a service called Personalized Search into wide release. Personalized Search studies the previous 180 days of your searching behavior and skews its results accordingly, based on its best guess as to what you’re looking for and how you look for it.

The principle is almost endlessly generalizable. Anywhere the specter of unconstrained choice confronts us, we’re meeting it by outsourcing elements of the selection process to software. Largely unconsciously, we radiate information about ourselves and our personal preferences all day long, and more and more recommendation engines of all shapes and sizes are hoovering up that data and feeding it back to us, reshaping our reality into a form that they fondly hope will be more to our liking — in an endless feedback loop. The effect is to create a customized world for each of us, one that is ever so slightly childproofed, the sharp edges sanded off, and ever so slightly stifling, like recirculated air.

How far will it go? Will we eventually surf a Web that displays only blogs that conform to our political leanings? A social network in which we see only people of our race and religion? Our horizons, cultural and social, would narrow to a cozy, contented, claustrophobic little dot of total personalization.

Let’s hope not. People weren’t built to play it safe all the time. We were meant to be bored and disappointed and offended once in a while. It’s good for us. That’s what forces us to evolve. Even if it means watching Rocky IV, with Dolph Lundgren. Who knows? You might even like it.
