Amateur Strategy Befuddles Expert Go-playing AI
- By DSAITrends editors
- December 14, 2022
In 2016, AlphaGo, an AI running on a supercomputer, defeated Lee Sedol, one of the world's top Go players. This was a significant milestone in the field of AI, given the complexity of the game, and it demonstrated the potential of AI to outperform humans in certain domains.
Fast forward half a dozen years, and the open-sourced KataGo can now beat top-ranking human Go players in the comfort of your home. For the uninitiated, a well-tuned KataGo model can be trained to superhuman strength on a single top-end consumer GPU, albeit after a few months of initial training.
Defeated by a lesser opponent
However, a new paper shows that a much weaker adversarial Go-playing program can trick KataGo into losing. This comes down to the inherent fragility of deep-learning models: blind spots where a model can fail in a completely unexpected fashion.
“KataGo generalizes well to many novel strategies, but it does get weaker the further away it gets from the games it saw during training,” explained co-author Adam Gleave, a Ph.D. candidate at UC Berkeley, to Ars Technica. “Our adversary has discovered one such ‘off-distribution’ strategy that KataGo is particularly vulnerable to, but there are likely many others.”
To be clear, the adversarial program is not particularly good at Go and can itself be defeated by an amateur human player. But because similar blind spots could surface in any deep-learning model, the result has troubling implications for the current generation of AI systems when they face completely unexpected situations.
“The research shows that AI systems that seem to perform at a human level are often doing so in a very alien way, and so can fail in ways that are surprising to humans. This result is entertaining in Go, but similar failures in safety-critical systems could be dangerous,” said Gleave.
One possible example would be a self-driving car encountering an unlikely scenario it was never trained on. Bad actors could exploit such a blind spot to trick the car into performing dangerous maneuvers, an attack that would theoretically be replicable across other vehicles running the same model. Similarly, AI-powered banking apps could be tricked into performing transactions or transfers that are normally disallowed.
Ultimately, the research highlights the need for greater scrutiny and automated testing of AI systems to find worst-case failure modes, says Gleave. Read the research paper, “Adversarial Policies Beat Professional-Level Go AIs,” here.
Image credit: iStockphoto/Saran_Poroong