WHY THIS MATTERS IN BRIEF
We hear all the time how AI and robots will replace all our jobs, but not how they will help us become superhuman at work. This is a great experiment …
Which is better at diagnosing radiological images: the world’s top radiologist or the world’s top radiology Artificial Intelligence (AI)? Or are they better together? If this latest experiment is to be believed, they’re better together, so there’s great hope for us all yet. It fits a trend I call the “Centaur Principle”: when you combine the best of human skills with the best of AI abilities, we humans become “Super Human.” That is to say, we become even better at our jobs and more productive, and our results and skills improve, in some cases by orders of magnitude.
Take Google as a quick example. Are you better or worse off for having Google, and instant access to all the world’s information at your fingertips, in your life? It’s likely better, and this is just one example of how technology used wisely can augment human skills beyond the ordinary.
The Future of Work keynote, by Futurist Matthew Griffin
Move forward one step: what happens when you can use Conversational AI to talk and debate with these AIs? Forget mouse clicks and hunting for data. Just ask, query, debate, and discuss whatever you want using nothing more than natural language.
When we talk about the future of work, which I’ve been discussing a lot recently with the leaders of various governments and companies around the world, answering the fundamental question of “better together?” could impact the jobs of hundreds of millions of people across sectors where the trend of automation is already rife and on the rise.
After all, why automate someone when you can augment them and, as this study shows, boost company productivity and results by 30% or more? Now, hit pause on that automation program and explore your alternatives.
Up until now it’s been difficult to answer this question because, in spite of the rapid rise of both blue and white collar automation, very few studies have been done, which is why I found this one interesting.
According to a study conducted by researchers at Stanford University School of Medicine and Unanimous AI, small groups of radiologists moderated by AI algorithms achieve higher diagnostic accuracy than either individual radiologists or machine learning algorithms alone.
The technology used in the research is called Swarm AI, a swarm intelligence technology by Unanimous AI that empowers networked groups of humans by combining their individual insights in real-time with the help of AI algorithms to converge on optimal solutions. The research paper was presented earlier this week at the SIIM Conference on Machine Intelligence in Medical Imaging.
The researchers performed the study with a group of eight radiologists at different locations, connected by Swarm AI algorithms. The radiologists reviewed a set of 50 chest X-rays and, for each X-ray, predicted the likelihood that the patient had pneumonia.
After a few seconds of individually assessing each chest X-ray, the group worked together as a “Swarm”, converging on a probabilistic diagnosis to predict the likelihood of the patient having pneumonia. This generated a set of 50 probabilities for the 50 test cases.
At the same time, separately, the same set of 50 chest X-rays was run through CheXNet, a state-of-the-art 121-layer convolutional neural network that beat humans last year at predicting which patients were suffering from pneumonia. Prior studies have shown CheXNet to outperform individual human radiologists in pneumonia screening tasks.
These two sets of probabilities were then further compared using different statistical techniques.
The performance of the Swarm AI system involving a small group of human radiologists was evaluated against the software-only CheXNet system. These two methods were analyzed across three different performance metrics, namely, binary classification accuracy, Mean Absolute Error, and ROC analysis. Let’s see how these two methods performed.
- Binary Classification: Fifty percent was set as the cut-off probability for classifying a positive diagnosis. The CheXNet system achieved 60% diagnostic accuracy across the 50 test cases, while the Swarm AI system achieved 82% accuracy across the same 50 cases. The Swarm AI system was significantly more accurate in binary classification than the ML system (p<0.01, μdifference = 21.9%).
- Mean Absolute Error: For each case, the absolute error is the absolute value of the Ground Truth (the known real-world diagnosis against which predictions are checked) minus the Predicted Probability. MAE is the average of these errors across all cases, so lower is better. A bootstrap analysis of MAE revealed that the Swarm AI system had significantly higher probabilistic accuracy than the ML system (p<0.001, μdifference = 21.6%).
- ROC Analysis: The Swarm AI system and the CheXNet system have different approaches to probabilistic forecasting, so a single cut-off may not suit both. This is why a ROC (Receiver Operating Characteristic) analysis was performed, which compares the true positive rate to the false positive rate across different cut-off points. The Area Under the ROC Curve (AUROC) was measured for both methods, and the higher the AUROC, the better the classification. Again, the Swarm AI system came out ahead with an AUROC of 0.906, while the ML system achieved 0.708.
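For readers who want to see what these three metrics actually measure, here is a minimal Python sketch. It is not the researchers' code: the eight cases, the two sets of probabilities, and every function name below are made up for illustration, and the bootstrap shown is one common resampling approach that may differ from the one used in the paper.

```python
import random


def binary_accuracy(truth, probs, cutoff=0.5):
    """Share of cases where (probability >= cutoff) matches the 0/1 ground truth."""
    hits = sum((p >= cutoff) == bool(t) for t, p in zip(truth, probs))
    return hits / len(truth)


def mean_absolute_error(truth, probs):
    """Average of |ground truth - predicted probability|; lower is better."""
    return sum(abs(t - p) for t, p in zip(truth, probs)) / len(truth)


def auroc(truth, probs):
    """Chance that a random positive case scores above a random negative case.

    This pairwise count equals the area under the ROC curve; ties count half.
    """
    pos = [p for t, p in zip(truth, probs) if t == 1]
    neg = [p for t, p in zip(truth, probs) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_mean_difference(truth, probs_a, probs_b, metric, n_resamples=2000, seed=0):
    """Resample cases with replacement and average metric(A) - metric(B).

    Intended for accuracy or MAE; AUROC would fail on a resample that
    happens to contain only positive or only negative cases.
    """
    rng = random.Random(seed)
    n = len(truth)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [truth[i] for i in idx]
        a = [probs_a[i] for i in idx]
        b = [probs_b[i] for i in idx]
        diffs.append(metric(t, a) - metric(t, b))
    return sum(diffs) / len(diffs)


# Hypothetical data: 1 = pneumonia, 0 = no pneumonia (NOT the study's data).
truth = [1, 0, 1, 0, 1, 0, 0, 1]
group_probs = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.3, 0.8]   # made-up "swarm" output
model_probs = [0.6, 0.5, 0.4, 0.7, 0.8, 0.3, 0.2, 0.45]  # made-up "ML" output

for name, probs in [("group", group_probs), ("model", model_probs)]:
    print(name,
          "accuracy:", binary_accuracy(truth, probs),
          "MAE:", round(mean_absolute_error(truth, probs), 3),
          "AUROC:", auroc(truth, probs))

print("bootstrap mean accuracy difference:",
      round(bootstrap_mean_difference(truth, group_probs, model_probs, binary_accuracy), 3))
```

Note that accuracy depends on the chosen cut-off, while AUROC only looks at how well each system ranks positive cases above negative ones, which is why it is a fairer comparison when two systems produce probabilities in different ways.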
In non-geek speak, the Swarm human-machine AI system produced far more accurate diagnoses of pneumonia than even a state-of-the-art ML system like CheXNet.
“Diagnosing pathologies like pneumonia from chest X-rays is extremely difficult, making it an ideal target for AI technologies. The results of this study are very exciting as they point towards a future where doctors and AI algorithms can work together in real-time, rather than human practitioners being replaced by automated algorithms,” says Dr. Matthew Lungren, Assistant Professor of Radiology at Stanford University, in the Unanimous AI blog.
The results also suggest that Swarm algorithms could be a powerful tool for establishing Ground Truth, both for training machine learning systems and for validating them.
“It is likely that the Swarm AI system excels in certain types of cases, while the ML system excels in others. We believe future research should identify these differences, so each method can be applied to those cases which are most appropriate. Additional research is warranted using more definitive Ground Truth and a wider range of cases,” write researchers in the paper.