Ken Jennings and Brad Rutter are accustomed to making others feel the heat as they blaze through Jeopardy clue after Jeopardy clue. But tonight, the quiz show's two greatest champions will oppose a player who can't be psyched out.
It's time for the world to meet Watson.
IBM's Jeopardy-playing computer system appears to viewers at home as an avatar of the Earth on a black screen. In fact, it is a system years in the making, and perhaps the most impressive attempt ever to create a question-answering computer that understands the nuances of human language.
Watson is not connected to the Internet, but its databases overflow with books, scripts, dictionaries, and whatever other material lead researcher David Ferrucci could pack in. Storing information is the computer's strong suit; the grand artificial intelligence challenge of Jeopardy is the subtlety of words.
When the bright lights of Jeopardy go up tonight, there will be no human handler to tell Watson where inside its mighty databases to seek the answers. It must parse each clue and category title to figure out what it's being asked. It must race through its databases, pull up the relevant material, and pick out the right response with a high level of confidence. It must understand the puns and geeky quirks of America's Favorite Quiz Show. It must beat two Jeopardy champions to the buzzer. And it too must voice its responses in the form of a question.
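Watson's real architecture is vastly more elaborate, but the core loop—parse the clue, retrieve candidates, answer only when confidence clears a threshold—can be sketched in a few lines of Python. Everything below (the two-entry "corpus," the word-overlap scorer, the 0.5 threshold) is an invented stand-in for illustration, not IBM's method.

```python
# A toy, self-contained sketch of a confidence-gated answering loop.
# The mini corpus and the overlap scorer are illustrative stand-ins,
# not Watson's real components.

CORPUS = {
    "Mount Everest": "highest mountain on earth located in the himalayas",
    "Toronto": "largest city in canada on lake ontario",
}
BUZZ_THRESHOLD = 0.5  # only buzz in when confidence clears this bar

def confidence(clue_words: set[str], passage: str) -> float:
    """Naive confidence: fraction of clue words found in the passage."""
    return len(clue_words & set(passage.split())) / len(clue_words)

def answer_clue(clue: str) -> str | None:
    clue_words = set(clue.lower().split())
    best = max(CORPUS, key=lambda c: confidence(clue_words, CORPUS[c]))
    if confidence(clue_words, CORPUS[best]) < BUZZ_THRESHOLD:
        return None  # not confident enough to buzz in
    return f"What is {best}?"  # respond in the form of a question

print(answer_clue("This Himalayan peak is the highest mountain on Earth"))
```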
Jeopardy is among the most human of games—much more so than the mathematical and closed system of chess, which IBM's Deep Blue mastered back in the 1990s. But is Jeopardy therefore the greatest gaming challenge for a machine? Not quite, AI researcher and computer science professor Bart Massey of Portland State University says. Checkers, chess, Scrabble, bridge, backgammon, poker, Stratego, and more—they've all seen software designers hoping to create the system that cracks the game. But the history of building computers to play games is the history of overestimating the rise of the machines and underestimating the strength of the brain. Watson could crush its all-star competition over the next three days, but computers even today lag behind the best human players in many of our favorite games—including some surprising ones.
The Grid: Where Computers Reign
Picture the playing areas of tic-tac-toe or Connect Four. They are small and two-dimensional; the two players take turns making their marks on the playing surface in an attempt to place three or four marks in a row, while preventing the opponent from doing the same. This elegant simplicity makes these diversions kids’ games, which players give up for more interesting pursuits (gaming or otherwise) as they grow up. The setup also makes these games perfect for computers.
There's no hidden info in tic-tac-toe, Massey writes in his notes (pdf) on AI and games. There's no luck, either. A tic-tac-toe-playing computer learns the same skill of forcing a draw if necessary that many a child has learned after plenty of losses. Most importantly, a computer can do what a child cannot: simulate every possible outcome and choose the perfect move. As a result, tic-tac-toe and Connect Four fall into the category of "solved" games. Computers play them perfectly. You stand no chance.
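What "simulating every possible outcome" looks like in code is the minimax algorithm: recursively try every legal move for both sides and back the results up the game tree. Here is a minimal Python sketch for tic-tac-toe—a generic illustration of the technique, not any particular solver's code.

```python
# Minimax for tic-tac-toe: exhaustively simulate every continuation.
# Board: a list of 9 cells, each 'X', 'O', or ' '.

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Score the position for `player` to move: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return (1 if w == player else -1), None
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return 0, None  # board full: a draw
    best_score, best_move = -2, None
    opponent = 'O' if player == 'X' else 'X'
    for m in moves:
        board[m] = player                    # try the move...
        score, _ = minimax(board, opponent)  # ...simulate every reply
        board[m] = ' '                       # ...and undo it
        if -score > best_score:              # opponent's loss is our gain
            best_score, best_move = -score, m
    return best_score, best_move

score, move = minimax([' '] * 9, 'X')
print(score, move)  # 0: with perfect play, tic-tac-toe is always a draw
```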
But what happens when the board gets bigger, and the rules a little more complex? Things get messier.
King Me
Taking a leap up in difficulty from tic-tac-toe brings us to checkers. It's played on an 8-by-8 board rather than a 3-by-3 grid. The objects in question are not stationary marks but pieces that move and interact. And the computing challenge is much, much greater.
Jonathan Schaeffer knows this all too well. The University of Alberta professor is a member of a research team that created a poker-playing AI that can best human players in limit Texas hold 'em. But the labor of his life has been checkers.
In 1989, Schaeffer began creating a program to master the game. And long before Deep Blue played chess master Garry Kasparov, Schaeffer's Chinook program faced checkers master Marion Tinsley, considered by some the greatest player of all time. "From 1950 to 1992, he played in thousands of games," Schaeffer says, "and in serious competitive games he lost a total of three." Tinsley won his first match against the machine, in 1992. By the time the two met again in 1994, Chinook appeared up to the task. They played six games, fighting each other to a draw each time. But Tinsley's health was failing; he resigned from the match and passed away in 1995.
Schaeffer recounts these battles against the old master with pride, mixed with regret and a touch of annoyance: regret because the passing years didn't allow the upgraded Chinook to finish its face-off against Tinsley, and annoyance at those who dismissed the program's performance. The only way to shut up the naysayers would be to play checkers perfectly and prove that the machine could have beaten the almost-perfect Tinsley in his prime.
Nobody's naysaying now. Schaeffer solved checkers in 2007. Chinook can calculate every possible series of moves from every possible position.
If you need a yardstick by which to measure the complexity of checkers, consider that 13 years passed between the second Tinsley match—when Chinook could take on the man Schaeffer described as "as close to perfection as you could imagine in a human" and play him to a standstill—and Schaeffer's grand victory over the game itself. If you need another, here it is: There are about 5 x 10^20 possible positions in checkers. At a certain point, ridiculously huge numbers become just ridiculously huge numbers, so Schaeffer's preferred analogy is this: if you emptied the Pacific Ocean, 5 x 10^20 is roughly the number of spoonfuls of water it would take to refill it. "If you look at the computing required to go through 10^20 possible positions, this computation would not finish in my lifetime. And I was determined I would live to see the final results."
So Schaeffer's team sought a shortcut. Rather than waiting around for a next-generation supercomputer to land in their laps, they turned to better algorithms that could sort through those positions faster. Those new algorithms cut down the number of positions that Chinook needed to examine from 10^20 to 10^14—a colossal difference, which allowed Schaeffer to complete the work after 18 painstaking years of effort. Not too long to wait for perfection.
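Chinook's actual proof combined forward search with massive endgame databases, and its algorithms are beyond a quick sketch. But the textbook example of how a smarter algorithm skips provably irrelevant positions is alpha-beta pruning, shown here in Python on a toy, hand-built game tree.

```python
def alphabeta(node, alpha, beta, maximizing):
    """Evaluate a game tree, skipping branches that can't matter."""
    if isinstance(node, int):          # leaf: a terminal evaluation
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:          # the opponent will never allow this
                break                  # prune: remaining siblings unexamined
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

# Nested lists as a toy tree. Once the second branch yields a 2, the
# maximizer knows it prefers the first branch, so the 9 is never examined.
tree = [[3, 5], [2, 9]]
print(alphabeta(tree, float('-inf'), float('inf'), True))  # 3
```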
Checkmate?
Watson inevitably invites comparison to Deep Blue, the chess-playing program that was IBM's previous high-profile foray into game AI. In many ways, they couldn't be more different. Where Watson must understand human language, chess is written in the computer's mother tongue—math and probability.
Each type of chess piece moves in its own particular directions and distances, where checkers differentiates solely between regular and kinged pieces. All 64 squares of the board may be used in chess, where checkers uses half as many. As a result, complexity accumulates quickly. It's an avalanche of digits fit for a computer, and in 1997, Deep Blue defeated Russian grand master Garry Kasparov in a six-game match. If machines bested man back when Intel Pentium IIs and CD-RWs first hit the market, AI powered by 2011 computing must be close to solving the game, right?
Right?
"Absolutely not," Jonathan Schaeffer says. "There is no way chess can be solved with modern technology. Even if you gave me a million computers, and I was using them 24 hours a day, 7 days a week, I cannot solve chess."
First of all, humans, don't get your hopes up too high: a modern chess AI could beat your brains out (and Deep Blue's, too). But the game of chess scoffs at programmers trying to push it into checkmate. Scientists don't even know the number of possible positions in chess, Schaeffer says. It's probably somewhere between 10^40 and 10^50. So, at the very least, it's 10^20—100,000,000,000,000,000,000—times more complex than checkers.
In the face of such statistics, AI expert Bart Massey says, even Moore's Law—the exponential increase in computing power over time—trembles with inadequacy.
And chess isn't even the trickiest of grid games. Consider Go, the game dating back to ancient China in which two players take turns placing stones at the intersections of a 19-by-19 grid, with the intention of encircling the other player's pieces to capture them. The gaudy number of possible moves the game board allows, combined with the need to plan strategy far in advance, has made Go the target of programmers for years. And while chess computers are stuck between beating up on humanity and achieving perfection, Massey says Go computers are not even close to human capability. Just recently, he says, the computer Go community achieved a major milestone, besting a top human player—when given a nine-stone handicap.
"There won't be any major popular game solved for a while now," Schaeffer says. "There's a gap."
That's putting it lightly. It's more like a chasm.
Into the Unknown
Don't talk to Imer Satz about chess.
"People like to program chess for two reasons," he says. "Number one, it's assumed to be some kind of gold standard for intelligence by humans. It's not, but that's how people think of it. Secondly, and more importantly, there's always an objectively correct move in chess because the information's all out there." People may disagree on what that correct move is, Satz says—"but that just means that somebody's right and somebody's wrong, or they're all wrong."
Satz himself has played plenty of chess, but his less-than-impressed attitude comes from programming computers to play an even more difficult game: Stratego. And in Stratego, there is no "correct" move.
If you've never played the game, it looks like chess: Two players line up their rows of attackers across from each other on a square board. They attack with the intent of capturing the one piece that will end the game—though in Stratego the object is to capture the flag, not put the king into checkmate.
That's where the major similarities end. Chess is an open book, with its medieval pieces and their positions free for all to see. Stratego is shadowy. You see your opponent move a piece, but you have no idea what that piece is. A marshal, the highest-ranked attacker? Or a miner, a defenseless piece used to defuse the bombs players place around flags? The only way to tell without attacking it is to infer your opponent's mindset and try to unpack his or her strategy.
Since computers don't have any experience being human, reading minds is not their cup of tea. What Satz has done with his Stratego-playing AI, Probe, is program it to know human tendencies. It can't read your poker face, but it knows the situations in which humans are statistically most likely to bluff; for example, it knows that players often act aggressively with a weak piece to try to mislead the opponent. It knows—from watching humans play, and because Satz told it so—that most players prefer to have strength on their right sides rather than their left (perhaps an effect of our species' dominant right-handedness, a quirk computers do not share). And, most dangerously, it remembers what you like to do and how you like to arrange your pieces. "Some people have played several thousand games against it," Satz says. "Probe is able to remember all the setups those opponents used, which is devastatingly strong information."
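Probe's internals aren't public, but the setup memory Satz describes reduces to simple bookkeeping: count every opening arrangement an opponent has used, then consult those counts when guessing what an unknown piece on a given square might be. A minimal sketch, with invented data:

```python
# A hypothetical sketch of an opponent "setup book," not Probe's code.
from collections import Counter, defaultdict

setup_book: defaultdict[str, Counter] = defaultdict(Counter)

def record_setup(opponent: str, setup: tuple[str, ...]) -> None:
    """Remember the full opening arrangement this opponent used."""
    setup_book[opponent][setup] += 1

def likely_piece(opponent: str, square: int) -> str | None:
    """Best guess at what this opponent usually places on a square."""
    guesses = Counter()
    for setup, count in setup_book[opponent].items():
        guesses[setup[square]] += count
    return guesses.most_common(1)[0][0] if guesses else None

record_setup("alice", ("marshal", "bomb", "miner"))  # toy 3-square "setups"
record_setup("alice", ("marshal", "scout", "miner"))
print(likely_piece("alice", 0))  # 'marshal': alice favors it on square 0
```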
Even so, Probe—which dominates other Stratego AIs in computer versus computer tournaments—can match the skill of only average human players. The AI challenge of dealing with incomplete information is too great.
I Cannot Tell a Lie. Can I?
Probe knows when you're most likely to bluff. But it is no actor.
"It has a much more difficult time bluffing itself, because a bluff is a lie," Satz says. "It is really, really hard to program a computer to pretend, which is to say, 'Raise yourself up on your hind legs, let your teeth and fangs show, but for God's sake don't actually attack or you'll get killed.'"
Deception takes center stage in games built on incomplete information. It's true in Stratego, and, of course, it's true in poker. Players who survive are those who are a little unpredictable. "If you're somebody who's so risk-averse that you never take a chance, your opponent will discover that, and that's it," Satz says. "If you're the kind of person who ignores all risk, then you're dead as well."
So how did Jonathan Schaeffer and the poker AI team at the University of Alberta teach their poker players to lie? They didn't.
"In poker, it's trivial," Schaeffer says. "You do absolutely nothing."
Come again?
The simple way to put it, Schaeffer says, is that his team doesn't try to program any kind of psychology into the program. When it plays, it computes a probability distribution for what to do, and that distribution of possible actions includes what we would call bluffing. The machine doesn't see a ridiculous raise on a 2 and a 5 as a "bluff" intended to scare off the other players. It simply sees the raise on a 2 and a 5 as one of the choices in its array of options, and it's programmed to choose it once in a while—specifically, the optimal number of times needed to stay unpredictable without losing all its money on bad bets. Therefore, Schaeffer doesn't have to program the machine to lie. The machine can lie without ever knowing the psychology of a bluff or any other gambit. "These are human names for commonly occurring scenarios," Schaeffer notes, dryly.
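In code, Schaeffer's point is that a "bluff" is nothing more than one branch of a weighted random choice. The probabilities below are invented for illustration; a real poker engine derives its distribution from game-theoretic analysis rather than hard-coding it.

```python
import random

def choose_action(hand_strength: float) -> str:
    """Pick an action from a mixed strategy; no bluffing logic anywhere."""
    if hand_strength < 0.2:  # a weak hand, like a 2 and a 5
        # Mostly fold, but raise 5 percent of the time. To an opponent the
        # occasional raise looks like a bluff; to the program it is just
        # the action the distribution happened to select.
        actions, weights = ("fold", "call", "raise"), (0.90, 0.05, 0.05)
    else:
        actions, weights = ("fold", "call", "raise"), (0.05, 0.45, 0.50)
    return random.choices(actions, weights=weights)[0]

print(choose_action(0.1))  # usually 'fold', occasionally a "bluff" raise
```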
Still, a poker program is no master gambler. The University of Alberta programs dominate even great poker players in limit Texas hold 'em, where betting is constrained. But when you take the training wheels off and play the game no-limit, humans win. "That is a significant change to the game," Schaeffer says, "and humans are superior."
Word Problems
The Jeopardy problem, put simply, is this: Watson could pack more information into its databases than even Ken Jennings could remember, but computers struggle to understand what a clue in plain English is asking of them—something kids can do without a second thought—and have even more trouble with the wordplay involved in many Jeopardy clues.
Thus, question-answering machines are a profound challenge for artificial intelligence. We all fight this battle every day when we struggle to tell Google what we're asking for. Just imagine the results if a machine connected to the knowledge of the world could understand your request in plain English: That's why IBM's commercials in the run-up to Watson's Jeopardy appearance have praised its potential to revolutionize medicine, research, and fields far beyond fun and games. Still, games have spurred machine language understanding, even before Jeopardy.
Ten to 15 years ago, Massey says, the hot challenge for programmers was crossword puzzles. After all, he says, there are myriad ways to fill in the board with letters, but only one that fits the clues. And like Jeopardy clue writers, crossword-clue writers are given to puns and bad jokes.
But perhaps the most intriguing word game computers play is Scrabble. It's intriguing because they don't destroy all humans in the competition.
The Scrabble board may seem to provide a perfect arena for computer perfection. After all, it's simple to give an AI a dictionary. Based on the board in front of it and its tiles in hand, you can imagine it choosing the maximum-point move in seconds while its human competitors mentally labor over the proper plays. However, Massey explains, the game of Scrabble is much more than a vocabulary contest.
A Scrabble endgame—the last few turns, when contests at the highest level are frequently decided—is the dream battleground of a tactician, not a linguist. The best Scrabble competitors play not only to score points for themselves, but to block off letters that would prove opportune to their opponents. They can do this because they've gotten a pretty good idea about what letters their opponents have by watching them play, and thinking to themselves: "If she had the 'X' and the 'Q,' she would've played 'quixotic' on that triple word score." That's a leap of logical reasoning beyond knowing all the words, Massey notes. And besides, top 1 percent Scrabble champs know all the words they need to. In any given game they may only see one or two words they didn't know before.
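The bookkeeping behind that inference is straightforward, because Scrabble's tile distribution is fixed: every tile not visible on the board or your own rack must be in the bag or on the opponent's rack. A minimal sketch, with the tile set abridged for brevity:

```python
from collections import Counter

# Scrabble's fixed tile distribution, abridged to five letters for brevity.
FULL_SET = Counter({"A": 9, "E": 12, "Q": 1, "U": 4, "X": 1})

def unseen_tiles(board_tiles: str, my_rack: str) -> Counter:
    """Tiles unaccounted for: the opponent's rack is drawn from these."""
    seen = Counter(board_tiles.upper()) + Counter(my_rack.upper())
    return FULL_SET - seen

# Late in the game, any unseen tile the opponent passed up a big play
# with can start to be ruled out of their rack.
print(unseen_tiles(board_tiles="AEEA", my_rack="EQU"))
# Counter({'E': 9, 'A': 7, 'U': 3, 'X': 1}): the Q is accounted for
```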
Make no mistake; computers are great at Scrabble. But their skill is not (yet) enough to overcome the best.
Massively Multiplayer Meltdown
Deception, intuition, and mastery of language give us gaming advantages that computers can't touch. But there's something else we have over them: Computers weaken when the number of players expands.
Scrabble can be played with several players, and Jeopardy is a game for three (though it's organized differently). But most of the games computer programmers have attempted to crack have one thing in common: They are heads-up, two-player, zero-sum games, like chess. Truly multiplayer games add many more variables that computers struggle with. Schaeffer's poker systems, for instance, don't play well against more than one opponent.
Massey says there are many more examples: "Risk is another game that computers don't play well at all, because it's a significantly multiplayer game. In Jeopardy you don't really worry much about strategic stuff, you just try to get as many points as you can. You can't play Risk like that. You can't just try to capture as many countries as you can. You have to think not only about how you're going to beat your opponents, but how your opponents are going to interact with each other."
Their technology may have conquered the world, but computers haven't proven capable of conquering the Risk map quite yet.
Minds Like Us?
Nobody knows the power of the human mind the way programmers and AI experts like Schaeffer, Massey, and Satz do—they've invested countless hours of their lives trying to match and then surpass the human abilities to play games, and they've come up against tactics like deception that machines struggle with and humans pull off with ease. In games like Stratego that play to human strengths, the obvious solution might be to make machines that resemble the human mind as much as possible.
But Satz doesn't see it that way.
"I have a completely different view: That you should never make an assumption that a computer program should operate on the same principles as a human," he says. "I don't think it's valid to make the starting assumption that the way humans have always played a game is necessarily the best way."
That's the power of a gaming computer. It's not just that computers can execute so many more calculations per second than we can, memorize more openings and endgames than we can, and evaluate many more possible options than we can. Because of those superhuman mathematical abilities, a machine plays the game in ways human players might have rejected as wrong, or never considered at all.
As a result, gaming computers aren't just trying to beat us. They're teaching us.
"Computers aren't bound by human preconceptions," Schaeffer says. "We've seen this in chess, and checkers, and bridge, and backgammon. Computers have revolutionized how humans play, because the computer doesn't have all the human baggage—the biases that we're taught. It comes to these games with fresh eyes."