WHY THIS MATTERS IN BRIEF
As we continue trying to read the minds of today’s so-called black box AIs, it turns out there is more than one way to do it. Turning to biology is unconventional, but it works.
The deep neural networks that power today’s Artificial Intelligence (AI) systems work in mysterious ways. They’re black boxes: a question goes in and an answer comes out the other side, and while we might not know exactly how a black box AI system works, we do know that it works. Over the past year there have been a few attempts to read and analyse the minds of these black boxes, from Nvidia, which used visualisations, to MIT, which analysed a neural network’s layers, to Columbia University, which pitted AIs against each other. But, frankly, none of them comes close to the out-of-the-box thinking, if you’ll excuse the pun, of this approach: using biology itself to crack the black box open. And it worked.
A new study that mapped a neural network to the biological components of a simple yeast cell allowed researchers to watch the AI system at work, and gave them insights into cell biology in the process. The resulting technology could help in the quest for new cancer drugs and personalised treatments.
First, let’s cover the basics of the neural networks used in today’s machine learning systems.
Computer scientists provide the framework for a neural network by setting up layers, each of which contains thousands of “artificial neurons” that perform tiny computational tasks. The trainers feed in a dataset, for example millions of cat and dog photos, millions of Go moves, or millions of driver actions and outcomes, and the system links the neurons in its layers together so that they carry out structured sequences of computations.
The system then runs the question through the neural network and checks how well it performed its task, for example, how accurately it distinguished cats from dogs. The neural network then adjusts the connections between its neurons and runs the question through again to see if the new connection pattern produces a better result. When the neural network is able to perform its task with great accuracy, its trainers consider it a success.
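To make that run, check and adjust loop concrete, here is a minimal sketch in Python. It is not how any production system is built: it trains a single artificial neuron on six made-up data points, and the names `train_neuron`, `accuracy` and `TOY_DATA` are invented for illustration. In practice the adjusting step usually means nudging numeric connection weights, which is what this sketch does using gradient descent.

```python
import math
import random


def train_neuron(examples, epochs=2000, lr=0.5, seed=0):
    """Train one logistic neuron: run each example through, measure the error, adjust."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # Forward pass: weighted sum squashed into a value between 0 and 1.
            z = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1.0 / (1.0 + math.exp(-z))
            # How wrong was the answer?
            error = prediction - target
            # Adjust each connection weight a little in the direction that reduces the error.
            weights = [w - lr * error * x for w, x in zip(weights, inputs)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return 1 if the neuron's weighted sum is positive, else 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0


def accuracy(weights, bias, examples):
    """Fraction of examples the neuron labels correctly."""
    correct = sum(predict(weights, bias, x) == y for x, y in examples)
    return correct / len(examples)


# Made-up data: the label is 1 when the first feature is larger than the second.
TOY_DATA = [
    ((0.9, 0.1), 1), ((0.8, 0.3), 1), ((0.7, 0.2), 1),
    ((0.2, 0.8), 0), ((0.1, 0.9), 0), ((0.3, 0.7), 0),
]

weights, bias = train_neuron(TOY_DATA)
print("accuracy on the toy data:", accuracy(weights, bias, TOY_DATA))
```

Even in this tiny case, the trained weights are just numbers. Nothing about them says why the neuron answers the way it does, which is the black box problem in miniature.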
These days, black box AI systems are accomplishing remarkable things. They are, just for starters, the technology behind autonomous vehicles and expert-level game-playing machines, and they are also helping to diagnose medical ailments, in some cases more accurately than human doctors.
“Although they’re called neural networks, these systems are only very roughly inspired by human neural systems,” explains Trey Ideker, a professor of Bioengineering and Medicine at the University of California San Diego who led the research.
“Look at AlphaGo [the program that beat the Go grandmaster]. The inner workings of the [neural network] are a complete jumble, it looks nothing like the human brain,” Ideker says, “they’ve evolved a completely new thing that just happens to make good predictions. Machine learning systems can analyze the online behaviors of millions of people to flag an individual as a potential terrorist or suicide risk, yet we have no idea how the machine reached that conclusion.”
“For machine learning to be useful and trustworthy in healthcare,” Ideker said, “practitioners need to open up the black box and understand how a system arrives at a decision.”
Ideker wanted to see if he could use a biological approach to crack open AI’s black box, building a system that would not just spit out answers but also show researchers how it reached them. He also thought that by mapping a neural network to the components of a yeast cell, his team could learn about the way life works. He was right.
“We’re interested in a particular [neural network] structure that was optimized not by computer scientists, but by evolution,” he says.
This project was doable because brewer’s yeast, a single-celled organism, has been studied since the 1850s as a basic biological system.
“It was convenient because we had a lot of knowledge about cell biology that could be brought to the table,” Ideker says. “We actually know an enormous amount about the structure of a yeast cell.”
So his team mapped the layers of a neural network to the virtual components of a yeast cell, starting with the most microscopic elements, the nucleotides that make up its DNA, moving upward to larger structures such as ribosomes, which take instructions from the DNA and make proteins, and finally to organelles like the mitochondrion and nucleus, which run the cell’s “operations.” Overall, the team’s neural network, which they call DCell, makes use of 2,526 subsystems from the yeast cell.
DCell, which can be accessed online, allows researchers to change a cell’s DNA and see how those changes ripple upward to alter its biological processes and, in turn, cell growth and reproduction. Its training data set consisted of several million examples of genetic mutations in real yeast cells, paired with information about the results of those mutations.
The researchers found that DCell could use its simulated yeast to accurately predict cell growth. And since it’s a “visible” neural network, the researchers could see the cellular mechanisms that were altered when they messed around with the DNA.
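The idea of a “visible” network can be sketched in a few lines of Python. This is a toy under heavy assumptions, not DCell itself: the three-subsystem hierarchy, the gene names and the simple averaging rule are all hypothetical, whereas DCell learns what each of its 2,526 subsystems does from training data. The point is only that every intermediate value carries a biological name that can be inspected.

```python
# Hypothetical hierarchy: each subsystem lists its children (genes or smaller subsystems).
HIERARCHY = {
    "ribosome": ["gene_a", "gene_b"],
    "mitochondrion": ["gene_c", "gene_d"],
    "cell_growth": ["ribosome", "mitochondrion"],
}


def subsystem_states(genotype, hierarchy=HIERARCHY, root="cell_growth"):
    """Return the state of every named subsystem (1.0 = fully functional).

    Assumption for illustration: a subsystem's state is the plain average of
    its children's states. A real model would learn this function from data.
    """
    states = {}

    def state_of(name):
        if name in genotype:  # a gene: 1.0 = intact, 0.0 = deleted
            return genotype[name]
        if name not in states:
            children = hierarchy[name]
            states[name] = sum(state_of(child) for child in children) / len(children)
        return states[name]

    state_of(root)
    return states


WILD_TYPE = {"gene_a": 1.0, "gene_b": 1.0, "gene_c": 1.0, "gene_d": 1.0}
MUTANT = dict(WILD_TYPE, gene_c=0.0)  # delete one hypothetical gene

mutant_states = subsystem_states(MUTANT)
for name, value in mutant_states.items():
    print(f"{name}: {value}")
```

Deleting `gene_c` lowers the `mitochondrion` state, and that change then shows up in `cell_growth`, while `ribosome` is untouched. Because each intermediate number is attached to a named subsystem, the path from mutation to predicted growth can be read off directly.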
This transparency means that DCell could potentially be used for in silico studies of cells, obviating the need for expensive and time-consuming lab experiments. If the researchers can figure out how to model not just a simple yeast cell but also complex human cells, the effects could be dramatic.
“If you could construct a whole working model of a human cell and run simulations on it,” says Ideker, “that would utterly revolutionize precision medicine and drug development.”
Cancer is the most obvious disease to study, because each cancer patient’s tumor cells contain a unique mix of mutations.
“You could boot up the model with the patient’s genome and mutations, and it would tell you how quickly those cells will grow, and how aggressive that cancer is,” Ideker says.
What’s more, pharma companies searching for new cancer drugs use cell growth as the metric of success or failure. They look at a multitude of molecules that turn different genes on or off, asking of each one: does this potential drug cause the tumor cell to stop multiplying? With billions of dollars going to R&D for cancer drugs, an in silico shortcut has clear appeal.
Upgrading from yeast to human cells won’t be an easy task, though. Researchers need to gather enough information about human patients to form a training data set for a neural network, which means millions of records that include both patients’ genetic profiles and their health outcomes. But that data will accumulate fairly quickly, Ideker predicts.
“There’s a ton of attention going into sequencing patient genomes,” he says.
The trickier part is gathering the knowledge of how a human cancer cell works, so the neural network can be mapped to its component parts. Ideker is part of a consortium called the Cancer Cell Map Initiative that aims to help with this challenge. Cataloguing a cancer cell’s biological processes is tough because mutations don’t just switch cellular functions on and off; they can also dial them up or down, and can act in concert in complicated ways.
Still, Ideker is hopeful that he can employ a machine learning technique called transfer learning to get from a neural network that models yeast cells to one that models human cells.
“Once you’ve built a system that recognizes cats, you don’t need to retrain the whole neural network to recognize squirrels,” he says.
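Transfer learning, in its simplest form, means keeping the layers already trained on one task frozen and retraining only the final layer on the new task. The Python sketch below illustrates that general idea under stated assumptions and says nothing about how Ideker’s team would actually do it: `pretrained_features` is a hand-written stand-in for layers trained earlier, and `NEW_TASK` is made-up data.

```python
import math


def pretrained_features(inputs):
    """Stand-in for layers already trained on an earlier task; never updated here."""
    x, y = inputs
    return (x + y, x - y)


def train_head(examples, epochs=1000, lr=0.5):
    """Train only a new output neuron on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            feats = pretrained_features(inputs)  # reused as-is, not retrained
            z = sum(w * f for w, f in zip(weights, feats)) + bias
            prediction = 1.0 / (1.0 + math.exp(-z))
            error = prediction - target
            # Only the new head's weights and bias are adjusted.
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def classify(weights, bias, inputs):
    """Label an input using the frozen features plus the retrained head."""
    feats = pretrained_features(inputs)
    z = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1 if z > 0 else 0


# Made-up new task: the label is 1 when the two inputs add up to more than 1.
NEW_TASK = [
    ((0.9, 0.8), 1), ((0.1, 0.2), 0), ((0.7, 0.9), 1),
    ((0.3, 0.1), 0), ((0.8, 0.6), 1), ((0.2, 0.4), 0),
]

head_weights, head_bias = train_head(NEW_TASK)
print([classify(head_weights, head_bias, x) for x, _ in NEW_TASK])
```

The appeal is that the new task needs far fewer adjustable parameters, and so typically far less data, than training the whole network from scratch.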
The research was published in Nature Methods.