WHY THIS MATTERS IN BRIEF
StarCraft is a game of strategy, and as such is far more complex for an AI to master than board games like chess or Go, but now one of the world’s most advanced AIs has done just that.
Following on from the announcement earlier this year that its Artificial Intelligence (AI) had hammered some of the world’s top StarCraft II gamers, DeepMind, Google’s leading-edge AI company, has just announced it has reached another major milestone: its AlphaStar AI has become a grandmaster in the real-time strategy game StarCraft II, meaning it is capable of besting 99.8 percent of all human players in competition. It is also a moment that Bill Gates said last year would “truly signal that advanced AI has arrived.” The findings are to be published in a research paper in the scientific journal Nature later this month.
Not only that, but DeepMind says it also leveled the playing field when testing the new and improved AlphaStar against human opponents who opted into online competitions this past summer. For one, it trained AlphaStar to use all three of the game’s playable races, adding to the complexity of the game at the upper echelons of pro play. It also limited AlphaStar to viewing only the portion of the map a human would see, and restricted its action rate to 22 non-duplicated actions every five seconds of play, to align it with standard human movement.
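DeepMind has published the numbers behind those limits but not the exact bookkeeping, so the Python sketch below is only one plausible reading of the rule, with invented names rather than DeepMind’s actual interface: a sliding-window limiter that suppresses duplicate actions and caps distinct ones at 22 per five-second window.

```python
from collections import deque

class ActionRateLimiter:
    """Sliding-window cap: at most `max_actions` distinct (non-duplicated)
    actions per `window` seconds, echoing the 22-actions-per-5-seconds
    restriction described above. All names here are illustrative."""

    def __init__(self, max_actions=22, window=5.0):
        self.max_actions = max_actions
        self.window = window
        self._recent = deque()  # (timestamp, action) pairs within the window

    def try_act(self, action, now):
        # Evict entries that have aged out of the window.
        while self._recent and now - self._recent[0][0] > self.window:
            self._recent.popleft()
        # One reading of "non-duplicated": repeats of an action already
        # issued in the window are suppressed and don't consume budget.
        if any(a == action for _, a in self._recent):
            return False
        if len(self._recent) >= self.max_actions:
            return False  # budget exhausted; the agent must wait
        self._recent.append((now, action))
        return True

limiter = ActionRateLimiter()
print(limiter.try_act("build_worker", now=0.0))  # True: first action
print(limiter.try_act("build_worker", now=1.0))  # False: duplicate in window
```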
And despite these handicaps, AlphaStar was still able to thrash almost everyone it faced and achieve grandmaster level, the highest possible online competitive ranking, becoming the first AI system to do so.
DeepMind sees the advancement as further proof that general-purpose reinforcement learning, the machine learning technique underpinning AlphaStar’s training, may one day be used to train self-learning robots and self-driving cars, and to create more advanced image and object recognition systems.
“The history of progress in artificial intelligence has been marked by milestone achievements in games. Ever since computers cracked Go, chess and poker, StarCraft has emerged by consensus as the next grand challenge,” said David Silver, a DeepMind principal research scientist on the AlphaStar team, in a statement. “The game’s complexity is much greater than chess, because players control hundreds of units; more complex than Go, because there are 10^26 possible choices for every move; and players have less information about their opponents than in poker.”
Back in March, DeepMind announced that its AlphaStar system was able to best top pro players 10 matches in a row during a prerecorded session, but it lost to pro player Grzegorz “MaNa” Komincz in a final match streamed live online. The company kept improving the system between January and June, when it said it would start accepting invites to play the best human players from around the world. The ensuing matches took place in July and August, DeepMind says.
And the results were stunning: AlphaStar had become one of the most sophisticated StarCraft II players on the planet, yet remarkably it was still not quite superhuman. Roughly 0.2 percent of players remain capable of defeating it, but it is widely considered only a matter of time before the system improves enough to crush any human opponent.
This research milestone closely aligns with a similar one from San Francisco-based AI research company OpenAI, which has been training AI agents with reinforcement learning to play the sophisticated five-on-five multiplayer game Dota 2. Back in April, the most sophisticated version of its software, known as OpenAI Five, bested the world champion Dota 2 team after only narrowly losing to two less capable esports teams the previous summer. The leap in OpenAI Five’s capabilities mirrors AlphaStar’s, and both are strong examples of how this approach to AI can produce unprecedented levels of game-playing ability.
As with OpenAI’s Dota 2 bots and other game-playing agents, the goal of this type of AI research is not simply to crush humans in various games to prove it can be done. Instead, it’s to prove that, with enough time, effort, and resources, sophisticated AI software can best humans at virtually any competitive cognitive challenge, be it a board game or a modern video game.
It’s also to show the benefits of reinforcement learning, a branch of machine learning that has seen massive success in the last few years when combined with huge amounts of computing power and training methods like virtual simulation.
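To make that concrete, here is a minimal, self-contained reinforcement learning example: tabular Q-learning on a toy corridor task. It is purely illustrative; AlphaStar applies the same trial-and-error principle with deep neural networks and enormous amounts of compute rather than a lookup table.

```python
import random

# Toy corridor: positions 0..4, start at 0, reward 1 for reaching position 4.
# The agent acts, observes a reward, and nudges its value estimates toward
# whatever behaviour earns more, the core loop of reinforcement learning.
N_STATES = 5
ACTIONS = (-1, +1)  # step left or step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                      # episodes
    state = 0
    for _ in range(50):                   # cap steps per episode
        # Epsilon-greedy: usually exploit current estimates, sometimes explore.
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Temporal-difference update toward reward plus discounted future value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state
        if reward > 0:
            break

# The learned greedy policy typically steps right (+1) from every position.
print([max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)])
```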
Like OpenAI, DeepMind trains its AI agents against versions of themselves, and at an accelerated pace, a technique known as self-play, so that the agents can clock hundreds of years of play time in the span of a few months. That has allowed this type of software to stand on equal footing with some of the most talented human players of Go and, now, of much more sophisticated games like StarCraft and Dota.
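As a rough illustration of that idea, the toy Python loop below keeps a “league” of frozen past versions of the learner, pits the current agent against a randomly sampled snapshot, and periodically freezes new snapshots so the opposition keeps diversifying. The Agent class, the match model, and the skill updates are invented stand-ins for this sketch, not DeepMind’s method.

```python
import copy
import math
import random

class Agent:
    def __init__(self, skill=0.0):
        self.skill = skill  # a single number standing in for network weights

def play_match(a, b):
    # Hypothetical match outcome: win probability is a logistic function
    # of the skill gap between the two agents.
    p_a_wins = 1.0 / (1.0 + math.exp(b.skill - a.skill))
    return random.random() < p_a_wins

learner = Agent()
league = [copy.deepcopy(learner)]  # frozen snapshots serve as opponents

for step in range(1, 1001):
    opponent = random.choice(league)               # sample a past version to face
    won = play_match(learner, opponent)
    learner.skill += (0.01 if won else -0.005)     # crude stand-in for a gradient step
    if step % 100 == 0:
        league.append(copy.deepcopy(learner))      # periodically freeze a snapshot

print(f"league size: {len(league)}, learner skill: {learner.skill:.2f}")
```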
Despite this, though, the software is still restricted to the narrow discipline it was designed to tackle: the Go-playing agent, for example, cannot play Dota, and vice versa. That said, DeepMind did let a more general-purpose version of its Go-playing agent, AlphaZero, try its hand at chess, which it mastered in just eight hours. That’s because the software isn’t programmed with hand-crafted rule sets or directions; instead, DeepMind and other research institutions use reinforcement learning to let the agents figure out how to play on their own, which is why the software often develops novel and wildly unpredictable play styles that have since been adopted by top human players.
“AlphaStar is an intriguing and unorthodox player — one with the reflexes and speed of the best pros but strategies and a style that are entirely its own. The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that’s unimaginably unusual; it really makes you question how much of StarCraft’s diverse possibilities pro players have really explored,” Diego “Kelazhur” Schwimer, a pro player for team Panda Global, said in a statement. “Though some of AlphaStar’s strategies may at first seem strange, I can’t help but wonder if combining all the different play styles it demonstrated could actually be the best way to play the game.”
DeepMind hopes the advances in reinforcement learning achieved by its lab and fellow AI researchers will be more widely applicable at some point in the future. The most likely real-world application for such software is robotics, where the same techniques can train AI agents to perform real-world tasks in virtual simulation, such as operating general-purpose robotic hands that can even solve Rubik’s Cubes one-handed. Then, after simulating years upon years of motor control, the AI can take the reins of a physical robotic arm, and maybe one day even control a full-body robot. But DeepMind also sees increasingly sophisticated, and therefore safer, self-driving cars as another venue for its specific approach to machine learning.