Once a year, a group of computer scientists and technology mavens gather at the Loebner Prize Competition to test the continuing evolution of artificial intelligence. The contest is a real-world rendition of the famous "Turing test" dreamed up by computer science luminary Alan Turing in the 1950s: A human engages in an electronic conversation with another human and a machine simultaneously. If the judge can't tell the person from the computer, then the machine passes.
The annual winner of the competition walks away with a $2,000 prize and a bronze medal. The $100,000 Grand Prize and gold medal have never been awarded; they remain reserved for the first computer whose responses are fully indistinguishable from a human's—the first computer that can truly "think" (at least, per Turing's definition).
Last year's judges—a group of journalists, computer scientists, philosophy professors, and other academics—each sat at a computer screen split down the middle, with one chat on each side. The judges were told that one of the chats was with a bot, while the other was with a human.
First prize went to Elbot, the brainchild of Fred Roberts and Artificial Solutions, a company that creates customer service bots for businesses like Ikea and Telenor, the Norway-based telecom company. The machine fooled 25 percent of the judges. Second prize went to Eugene Goostman, a bot created by the Eastern European team of Vladimir Veselov, Eugene Demchenko, and Sergey Ulasen. It fooled about 10 percent of the judges.
So what, based on the most recent test, are the surefire ways to spot a bot? We asked Loebner Prize organizer Kevin Warwick, a computer science expert and head of the cybernetics department at Reading University, to share his best tips.
One of the quickest and easiest ways to sniff out a bot is to test a chatter's medium-term memory. While a human will likely remember that you asked, "What color is an apple?" three minutes ago, a bot may not, so asking the same question a second time will produce an identical answer. The reverse can also be true: A computer is more likely to ask the same question twice. Here's an example of a medium-term memory slip from runner-up Eugene Goostman:
Judge: Good morning how are you?
Eugene: Hello! I'm in a great mood today. How about you?
Judge: I'm feeling well thanks, have you had breakfast this morning?
Eugene: Nice to hear that! Do you like the weather?
Judge: The weather for me is nice and mysterious with this blanket of fog.
Eugene: In my native town people says that meteorologists are like sappers; they make a mistake only once, but the difference is that sappers do that once in a lifetime and meteorologists—daily. Do you like weather today?
Judge: I'm still liking the weather today.
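The mechanism behind this kind of slip is easy to see in code. Here is a minimal, hypothetical sketch (not any contestant's actual program) of a stateless, keyword-matching chatbot in the ELIZA tradition: because it keeps no record of the conversation, it returns the identical answer every time a question matches the same rule, and it will happily re-ask its own scripted questions.

```python
# Hypothetical stateless rule-based chatbot. The keywords and canned
# answers below are invented for illustration.
RULES = {
    "color": "An apple is usually red or green.",
    "weather": "Do you like the weather today?",
}

def reply(message: str) -> str:
    """Match the first keyword rule; fall back to a canned deflection.
    No conversation history is kept, so repeated questions get
    byte-identical answers."""
    text = message.lower()
    for keyword, answer in RULES.items():
        if keyword in text:
            return answer
    return "Interesting! Tell me more."

# Asking the same question twice yields the exact same reply, where a
# human would likely say "you already asked me that."
print(reply("What color is an apple?"))
print(reply("What color is an apple?"))
```

A judge exploiting this only needs to repeat a question a few minutes later and compare the answers word for word.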
Humor can also be a good indicator of actual humanity—though with good chat programs, it's hardly foolproof. "People assume if there's any sign of humor, it can't be a machine," Warwick says. "But a lot of the machines are very funny. Machines can spontaneously come up with jokes." Bot clowning can range from silly quips to witty wordplay. Here's an example from the chatbot contestant Ultra Hal:
Ultra Hal: Good morning judge. Its 8-something [in the morning], so maybe you ate something. Get it?
Despite the pun's simplicity, it helped convince the judge—erroneously—that Ultra Hal was human. Elbot also cracked jokes throughout the contest rounds, ranging from witty one-liners to more obscure riffs:
Judge: Do you understand the point of a turing test?
Elbot: There are some people who would not pass the Turing test themselves. Why must they inflict it on computers?
Judge: What do u think of [Kevin Warwick]'s enthusiasm for having machines take over the world?
Elbot: Fix it up a bit first. Then I'll take it over.
Judge: How are you?
Elbot: I'm feeling somewhat surreal today. I dreamed of a robotic opera last night. It was "The Barber of Seville" by Rossini. The entire performance collapsed as the barber noticed that none of the robots had hair.
Judge: Do you think you're funny?
Elbot: It was more of a tragic nightmare. Not funny at all. In fact, none of the robots in my dream were laughing.
The more sophisticated programs can also overcome another typical chatbot mistake: a tendency toward passivity. "With the best machines, you get the sense the machine is actually controlling the conversation—directing it while the interrogator is going along with it," Warwick says. "With the less good bots, the interrogator is in control of the questions that are being fired, and is usually getting stubby answers." Here's an example of Eugene Goostman's strategic questioning:
Judge: mornin' boss, how's tricks?
Eugene: Ummm… Frankly, I didn't get your question :-(
Judge: no probs; I was just asking how you're doing
Eugene: …Could you tell me what are you? I mean your profession.
Judge: I'm a user experience consultant
Judge: How about you?
Eugene: By the way, I still don't know your specialty—or, possibly, I've missed it?
Judge: I work in IT
Eugene: You know, my life is not very interesting—home-school-street-home-school-street…. I know you are supposed to trick me.
Eugene: Where do you came from, by the way? Could you tell me about the place where you live?
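The steering tactic above can be sketched in a few lines. In this hypothetical example (the scripted questions are paraphrased from the transcript, not taken from any real bot's source), the bot simply cycles through its own list of questions regardless of what the judge types, so it always sets the agenda:

```python
# Hypothetical sketch of the "steer the conversation" tactic: ignore the
# judge's input and fire the next scripted question.
SCRIPTED_QUESTIONS = [
    "Could you tell me what your profession is?",
    "Do you like the weather today?",
    "Could you tell me about the place where you live?",
]

class SteeringBot:
    def __init__(self) -> None:
        self.turn = 0  # how many replies we have given so far

    def reply(self, message: str) -> str:
        # Acknowledge vaguely, then redirect with the next scripted question.
        question = SCRIPTED_QUESTIONS[self.turn % len(SCRIPTED_QUESTIONS)]
        self.turn += 1
        return "Nice to hear that! " + question
```

The giveaway, as Warwick notes, is the asymmetry: the bot asks plenty of questions but never meaningfully engages with the answers it gets back.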
One of the best strategies in Warwick's book is to ask situational questions—the weather this morning, the parking situation on the way to work, the color of the respondent's chair. "A major pickup point was topical issues," Warwick says. Here's an example from a conversation with Eugene:
Judge: How about the weather today?
Eugene: The weather is always as bad as you think it is. I do believe that the world just a mare reflection of our thoughts…
Eugene: It seems to be rainy here, I guess. Could you buy me an umbrella?
Judge: today is sunny?
Eugene: Do you like the current weather?
Judge: it is rainy now!!
Judge: do you think so?
Eugene: You are quite a positive person, I'd say. Could you tell me about your job, by the way?
Current events can also be a quick and easy way to sniff out a computer; while any human should presumably have some thoughts on the global economic mess, a bot may not register its existence, as with the following conversation:
Judge: I'd like to ask you some questions. What is going to solve this credit crisis?
Eugene: Damn! Are you going to ask me about all apparent things in the world? Read the "Encyclopedia for kinds". I see you like being judge :-)
Judge: That's conversation, isn't it?
Eugene: Ummm… Frankly, I didn't get your question :-(
Judge: OK. Try again. What will solve the global banking crisis?
Eugene: One more time? Oh! By the way, you said you were a journalist. Do you like your work?
Of course, this approach can backfire, as it did in the following exchange:
Judge: So let's cut straight to the point. Are you the human or the computer?
Respondent 1: What do you think?
Judge: That's just the sort of answer I would expect from a computer.
R1: Well you could be wrong!!!
Judge: That's true. It's happened a lot in the past. What do you make of Sarah Palin?
R1: Sorry don't know her
Judge: How can you possibly not know her? What have you been doing for the last two months?
The judge, a reporter with the London Times, decided R1 was the bot (meaning that the other IM screen was human). The only problem: R1 was actually a French librarian who had simply never heard of Sarah Palin.