WHY THIS MATTERS IN BRIEF
No two people learn in the same way, but optimising lesson plans for students, as well as for AIs, would help all of us learn new skills faster.
The world of education is being transformed by the arrival of Artificial Intelligence (AI), whether it’s AIs being used to monitor children’s behaviour in schools, grade their work, and coach them, AI that’s actually doing their work for them, or AI-powered teachers with neural network brains taking the classes. Now, in an about-face, one of education’s most central components, the curriculum, is being adapted to help AIs learn new tasks faster.
It’s common knowledge that students learn best when the curriculum they’re following is just right for their level of skill, and the same is true for AI. Now a team of computer scientists in the US has created an AI that can design its own curricula, so it can figure out the best way to teach itself and learn new tasks faster. The work could speed learning in self-driving cars and household robots, and it might even help crack previously unsolved math problems.
In one of the new experiments, an AI program tries to quickly reach a destination by navigating a 2D grid populated with solid blocks. The “agent” improves its abilities through a process called reinforcement learning, a kind of trial and error.
To help it navigate increasingly complex worlds, the researchers, led by University of California, Berkeley graduate student Michael Dennis and Natasha Jaques, a research scientist at Google, considered two ways to draw the maps. One method distributed blocks at random, but the AI didn’t learn much. The other remembered what the AI had struggled with in the past and maximized difficulty accordingly, but that made the worlds too hard, and sometimes even impossible to complete.
So the team created a setting that was just right, using a new approach they call PAIRED. First, they coupled their AI with a nearly identical one, albeit with a slightly different set of strengths, which they called the antagonist. Then, they had a third AI design worlds that were easy for the antagonist but hard for the original protagonist.
That kept the tasks just at the edge of the protagonist’s ability to solve. The designer, like the two agents, used a neural network, a program inspired by the brain’s architecture, to learn its task over many trials.
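To make the idea concrete, here is a minimal sketch of how a designer might score a world so that it favours ones the antagonist solves but the protagonist doesn't. It assumes the score is simply the gap between the two agents' average returns on that world; the article doesn't give the exact formula, and the function and parameter names here are hypothetical.

```python
def designer_score(antagonist_returns, protagonist_returns):
    """Score a world by how much better the antagonist did than the protagonist.

    Both arguments are lists of episode returns on the same world.
    Assumption: the score is the gap between the two average returns.
    """
    antagonist_avg = sum(antagonist_returns) / len(antagonist_returns)
    protagonist_avg = sum(protagonist_returns) / len(protagonist_returns)
    return antagonist_avg - protagonist_avg


# A world the antagonist solves and the protagonist doesn't scores highest.
solvable_but_hard = designer_score([1.0, 1.0], [0.0, 0.0])
# An impossible world (nobody succeeds) earns the designer nothing.
impossible = designer_score([0.0, 0.0], [0.0, 0.0])
# A trivial world (everybody succeeds) earns nothing either.
trivial = designer_score([1.0, 1.0], [1.0, 1.0])
```

Under this assumption, impossible and trivial worlds both score zero, which is one way to see why the designer would be pushed towards tasks that sit right at the edge of the protagonist's ability.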
After training, the protagonist attempted a set of difficult mazes. When it had been trained with either of the two older methods, it solved none of the new mazes. But after training with PAIRED, it solved one in five, the team reported last month at the Conference on Neural Information Processing Systems (NeurIPS).
“We were excited by how PAIRED started working pretty much out of the gate,” Dennis says.
In another study, presented at a NeurIPS workshop, Jaques and colleagues at Google used a version of PAIRED to teach an AI agent to fill out web forms and book a flight. Whereas a simpler teaching method led it to fail nearly every time, an AI trained with the PAIRED method succeeded about 50% of the time.
The PAIRED approach is a clever way to get AI to learn, says Bart Selman, a computer scientist at Cornell University and president of the Association for the Advancement of Artificial Intelligence.
Selman and his colleagues presented another approach for so-called “Auto-Curricula” at the meeting. Their task was a game called Sokoban, in which an AI agent must push blocks to target locations. But blocks can get stuck in dead ends, so success often requires planning hundreds of steps ahead – imagine rearranging large furniture in a small apartment.
Their system creates a collection of simpler puzzles to train on, with fewer blocks and targets. Then, based on the recent performance of their AI, it selects puzzles that the agent only occasionally solves, effectively ratcheting the lesson plan to the right level. Sometimes, the right puzzles are hard to predict, Selman says.
“The notion of what is a simpler task is not always obvious.”
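The selection step can be sketched in a few lines. This is only an illustration, assuming each puzzle's recent attempts are logged as a list of 1s (solved) and 0s (failed), and that "only occasionally solves" means a success rate inside a fixed band; the thresholds and names are hypothetical and not taken from Selman's system.

```python
def select_training_puzzles(recent_results, low=0.05, high=0.5):
    """Pick puzzles the agent solves only occasionally.

    recent_results maps a puzzle id to a list of recent outcomes
    (1 = solved, 0 = failed). The low/high band is an assumed threshold.
    """
    selected = []
    for puzzle_id, outcomes in recent_results.items():
        if not outcomes:
            continue  # no recent attempts, so nothing to judge by
        success_rate = sum(outcomes) / len(outcomes)
        if low <= success_rate <= high:
            selected.append(puzzle_id)
    return sorted(selected)


recent = {
    "two_blocks": [1, 1, 1, 1],    # always solved: too easy
    "six_blocks": [0, 0, 0, 0],    # never solved: too hard
    "four_blocks": [0, 1, 0, 0],   # occasionally solved: worth training on
}
next_lessons = select_training_puzzles(recent)
```

As the agent improves, the puzzles it used to solve occasionally drift out of the band and harder ones drift in, which is the ratcheting effect described above.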
The researchers tested their trained agent on 225 problems that no computer had ever solved. It cracked 80% of them, with about one-third of those successes coming strictly from the novel training method.
“That was just fun to see,” Selman says. He says he now receives astounded messages from AI researchers who’ve been working on the problems for decades. He hopes to apply the method next to unsolved math proofs.
Pieter Abbeel, a computer scientist at UC Berkeley, also showed at the meeting that autocurricula can help robots learn to manipulate objects. He says the approach could even be used for human students.
“As an instructor, I think, ‘Hey, not every student needs the same homework exercise,’” Abbeel says, noting that AI could help tailor harder or easier material to a student’s needs. As for AI auto-curricula, he says, “I think it’s going to be at the core of pretty much all reinforcement learning.”