MIT researchers have taught their AI to recognise sounds


By Matthew Griffin | Intelligence and the Senses | 5th December 2016

WHY THIS MATTERS IN BRIEF

Developing AIs that can recognise and understand raw sound could have implications for autonomous vehicles, elderly care, entertainment, home security and much more.

In recent years, computers have gotten remarkably good at recognizing speech and images. Think of the dictation software on most smartphones, or the algorithms that automatically identify images and people in photos posted to Google or Facebook.

But machine recognition of natural sounds – such as crowds cheering or waves crashing – has lagged behind. That’s because most automated recognition systems, whether they process audio or visual information, are the result of machine learning, in which computers search for patterns in huge compendia of training data. Usually, the training data first has to be annotated by hand – the equivalent of putting subtitles on your TV programs – which is prohibitively expensive and time-consuming for all but the highest-demand applications.

The researchers then trained a second neural network on the audio from the same videos. The second network’s goal was to correctly predict the object and scene tags produced by the first network purely from listening to the videos’ audio.

The result was a network that could interpret natural sounds and associate them with image categories. For instance, it might determine that the sound of birdsong tends to be associated with forest scenes and pictures of trees, birds, birdhouses, and bird feeders.
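The two-network arrangement described above can be illustrated with a toy sketch. This is not the MIT team's code: it assumes a tiny linear "student" model, two made-up audio feature vectors, and invented tag probabilities standing in for the first network's output. All names here (`TEACHER_TAGS`, `train_student`, and so on) are hypothetical. The student is trained to match the first network's tag probabilities using only the audio features.

```python
import math
import random

# Hypothetical tag vocabulary and toy data (all values are invented).
TAGS = ["forest", "beach"]
AUDIO_FEATURES = [[1.0, 0.0],   # stand-in features for a birdsong clip
                  [0.0, 1.0]]   # stand-in features for a crashing-waves clip
TEACHER_TAGS = [[0.9, 0.1],     # tag probabilities assumed to come from
                [0.2, 0.8]]     # the first (vision) network for each video


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): how far the student's q is from the teacher's p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def student_predict(weights, features):
    """Linear student: one weight row per tag, followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def train_student(audio_features, teacher_tags, n_classes,
                  lr=0.5, epochs=200, seed=0):
    """Fit the student to the teacher's tags with plain gradient descent."""
    rng = random.Random(seed)
    n_feats = len(audio_features[0])
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_feats)]
               for _ in range(n_classes)]
    for _ in range(epochs):
        for x, target in zip(audio_features, teacher_tags):
            pred = student_predict(weights, x)
            # For softmax outputs, the gradient of KL(target || pred)
            # with respect to each logit is (pred - target).
            for c in range(n_classes):
                grad = pred[c] - target[c]
                for j in range(n_feats):
                    weights[c][j] -= lr * grad * x[j]
    return weights


weights = train_student(AUDIO_FEATURES, TEACHER_TAGS, n_classes=len(TAGS))
for features, target in zip(AUDIO_FEATURES, TEACHER_TAGS):
    pred = student_predict(weights, features)
    best_tag = TAGS[pred.index(max(pred))]
    print(best_tag, [round(p, 3) for p in pred],
          "KL:", round(kl_divergence(target, pred), 6))
```

The point of the design is that no human labels the audio: the only supervision the student sees is the first network's output on the matching video frames.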

“With the modern machine-learning approaches, like deep learning, you have many, many trainable parameters in many layers in your neural-network system,” says Mark Plumbley, a professor of signal processing at the University of Surrey. “That normally means that you have to have many, many examples to train that on. And we have seen that sometimes there’s not enough data to be able to use a deep-learning system without some other help. Here the advantage is that they are using large amounts of other video information to train the network and then doing an additional step where they specialize the network for this particular task. That approach is very promising because it leverages this existing information from another field.”

Plumbley says that both he and colleagues at other institutions have been involved in efforts to commercialize sound recognition software for applications such as home security, where it might, for instance, respond to the sound of breaking glass. Other uses might include elderly care, to identify potentially alarming deviations from ordinary sound patterns, or to control sound pollution in urban areas.

“I really think that there’s a lot of potential in the sound-recognition area,” he says.
