

First AI Learned to Walk, Now It's Wrestling, Playing Soccer

D-brief
By Carl Engelking
October 13, 2017 8:41 PM




Oh, artificial intelligence, how quickly you grow up. Just three months ago you were learning to walk, and we watched you take your first, flailing steps. Today, you’re out there kicking a soccer ball around and wrestling. Where does the time go?

Indeed, for the past few months we’ve stood by like proud parents and watched AI reach heartwarming little milestones. In July, you’ll recall, DeepMind, Google’s artificial intelligence company in the United Kingdom, developed an algorithm that learned how to walk on its own. Researchers built a basic reward function into their algorithms that rewarded the AI only for making forward progress. As the AI sought to maximize that reward, complex behaviors like walking and avoiding obstacles emerged.

This month, researchers at OpenAI, a non-profit research organization, used a similar approach to teach AI to sumo wrestle, kick a soccer ball and tackle. Their setup consisted of two humanoid agents, each seeking to maximize its own reward. Initially, each agent was rewarded for moving around its environment and exploring its surroundings. Researchers then narrowed the reward to a specific, yet simple, goal.


(Caption: Remember when AI learned to walk? Isn't it cute?)

In the sumo-wrestling scenario, both agents were first rewarded for exploring the ring, and researchers varied the reward amounts based on distance from the center. Then, they pulled this reward away so the agents would learn to optimize for an even more basic reward: push the other one out of the ring. Round after round, each agent’s sumo skills got a little better, and they even taught themselves new tricks, like a last-second deke to fool a charging opponent.

The same approach worked for other challenges like soccer and tackling. While these are cool tricks, it's important to remember that all of these behaviors simply reflect optimized solutions to myriad calculations. Sure, they look like humanoids, but it's all math.
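For the curious, here is a rough idea of what that math can look like. This is a minimal Python sketch of the two-stage reward described above, not OpenAI's actual code: the function names, the linear decay schedule, the choice to pay more near the center of the ring, and the +1/-1 outcome values are all assumptions made for illustration.

```python
import math


def explore_weight(step, anneal_steps):
    """Linearly shrink the exploration bonus from 1 to 0 (assumed schedule)."""
    return max(0.0, 1.0 - step / anneal_steps)


def sumo_reward(pos, opponent_pos, ring_radius, weight):
    """Hypothetical per-step reward for one agent in a ring centered at (0, 0)."""
    dist = math.hypot(*pos)
    opp_dist = math.hypot(*opponent_pos)

    # Early-training bonus based on distance from the center.
    # Assumption: the bonus is largest at the center and zero at the edge.
    shaping = weight * max(0.0, 1.0 - dist / ring_radius)

    # The "even more basic" reward: push the other agent out of the ring.
    if dist > ring_radius:
        outcome = -1.0  # this agent left the ring
    elif opp_dist > ring_radius:
        outcome = 1.0   # the opponent left the ring
    else:
        outcome = 0.0   # bout still in progress

    return shaping + outcome
```

Once `explore_weight` reaches zero, the only signal left is whether someone has been pushed out of the ring, which is the point at which tricks like the deke would have to be discovered through trial and error.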

The work from OpenAI highlights the value of “competitive self-play” for future AI training. Given only basic reward parameters, AIs can develop surprising, novel behaviors to solve a task through a warp-speed process of trial and error. Today it might be sumo wrestling or awkward parkour, but it’s not far out of the realm of possibility to foresee robot autodidacts that learn to walk gracefully in the real world, care for the elderly or manage your 401(k). From what we’ve seen, it’s almost as if AI is in the midst of its “terrible twos”: awkwardly bumbling around, falling on the floor and learning to play. But if self-play is key for the maturation of AI, we may want to skip the teenage years.
