AI's "Screams of the damned" is the future of music


By Matthew Griffin Intelligence and the Senses 12th March 2021

WHY THIS MATTERS IN BRIEF

DeepFake videos are already destabilising democracy, and now DeepFake Audio is here to do the same to the music industry.


For a while now Artificial Intelligence (AI) has been used to create a variety of new types of synthetic music, including catchy new pop songs, lyrics, some classical favourites, and even full albums and elevator music – with Warner recently signing, you guessed it, Endel, the AI that produced the elevator music. As for that last one, maybe Warner should consider buying another AI, developed a while back, that helps A&R companies pick chart toppers …

Aside from being used to create brand new music, though, the audio cousins of DeepFake videos are now coming to help bring dead music stars back to life – stars like Frank Sinatra. Music, or audio, DeepFakes, as they’re known, are cute tricks, but they could change the music scene forever – in much the same way that CGI, DeepFakes, and Synthetic Content are changing the rest of the world’s media and entertainment industries.

“It’s Christmas time! It’s hot tub time!” sings Frank Sinatra. At least, it sounds like him. With an easy swing, cheery bonhomie, and understated brass and string flourishes, this could just about pass as some long lost Sinatra demo. Even the voice – that rich tone once described as “all legato and regrets” – is eerily familiar, even if it does lurch between keys and, at times, sounds as if it was recorded at the bottom of a swimming pool.

The song in question is not a genuine track, but a convincing audio DeepFake created by “research and deployment company” OpenAI, whose Jukebox project uses AI to generate music, complete with lyrics, in a variety of genres and artist styles. Along with Sinatra, they’ve done what are known as “deepfakes” of Katy Perry, Elvis, Simon and Garfunkel, 2Pac, Céline Dion and more. The model was trained on 1.2m songs scraped from the web, complete with the corresponding lyrics and metadata, and it can output raw audio several minutes long based on whatever you feed it. Input, say, Queen or Dolly Parton or Mozart, and you’ll get an approximation out the other end, as you can hear on Soundcloud:

OpenAI · Jukebox samples: Novel lyrics

“As a piece of engineering, it’s really impressive,” says Dr Matthew Yee-King, an electronic musician, researcher and academic at Goldsmiths. “They break down an audio signal into a set of lexemes of music – a dictionary if you like – at three different layers of time, giving you a set of core fragments that is sufficient to reconstruct the music that was fed in. The algorithm can then rearrange these fragments, based on the stimulus you input. So, give it some Ella Fitzgerald for example, and it will find and piece together the relevant bits of the ‘dictionary’ to create something in her musical space.”
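The “dictionary” idea Yee-King describes can be illustrated with a toy sketch. It assumes a tiny hand-written codebook and a single time scale, and it is not OpenAI’s actual Jukebox code: real systems learn the codebook from data and work at several time scales. The names `TOY_CODEBOOK`, `encode` and `decode` are made up for this illustration.

```python
from typing import List, Sequence

Frame = Sequence[float]


def squared_distance(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two equally sized fragments."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_into_frames(signal: Sequence[float], frame_len: int) -> List[List[float]]:
    """Cut the signal into consecutive, non-overlapping fragments."""
    last_start = len(signal) - frame_len + 1
    return [list(signal[i:i + frame_len]) for i in range(0, last_start, frame_len)]


def encode(signal: Sequence[float], codebook: List[List[float]], frame_len: int) -> List[int]:
    """Replace each fragment with the index of its closest dictionary entry."""
    codes = []
    for frame in split_into_frames(signal, frame_len):
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(frame, codebook[k]))
        codes.append(nearest)
    return codes


def decode(codes: Sequence[int], codebook: List[List[float]]) -> List[float]:
    """Rebuild an approximate signal by stitching dictionary entries together."""
    out: List[float] = []
    for k in codes:
        out.extend(codebook[k])
    return out


# Hypothetical three-entry "dictionary" of two-sample fragments (an assumption,
# chosen by hand rather than learned from audio).
TOY_CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0]]

signal = [0.1, -0.1, 0.9, 1.2, 1.1, -0.8, 0.0, 0.2]
codes = encode(signal, TOY_CODEBOOK, frame_len=2)
reconstruction = decode(codes, TOY_CODEBOOK)

print(codes)           # [0, 1, 2, 0]
print(reconstruction)  # [0.0, 0.0, 1.0, 1.0, 1.0, -1.0, 0.0, 0.0]
```

The list of indices is the compact description of the audio. In this simplified picture, a generative model would produce a new sequence of such indices in the style of the prompt, and decoding would turn them back into sound.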

Admirable as the technical achievement is, there’s something horrifying about some of the samples, particularly those of artists who have long since died – sad ghosts lost in the machine, mumbling banal cliches. “The screams of the damned” reads one comment below that Sinatra sample; “SOUNDS FUCKING DEMONIC” reads another.

Deepfake music is set to have wide-ranging ramifications for the music industry as more companies apply algorithms to music. Google’s Magenta Project – billed as “exploring machine learning as a tool in the creative process” – has developed several open source APIs that allow composition using entirely new, machine-generated sounds, or human-AI co-creations. Numerous startups, such as Amper Music, produce custom, AI generated music for media content, complete with global copyright. Even Spotify is dabbling: its AI research group is led by François Pachet, former head of Sony Music’s computer science lab.

It’s not hard to foresee, though, how such deepfakes could lead to ethical and intellectual property issues. If you didn’t want to pay the market rate for using an established artist’s music in a film, TV show or commercial, you could create your own imitation. Streaming services could, meanwhile, pad out genre playlists with similar sounding AI artists who don’t earn royalties, thereby increasing profits. Ultimately, will streaming services, radio stations and others increasingly avoid paying humans for music?

Legal departments in the music industry are following developments closely. Earlier this year, Roc Nation filed DMCA takedown requests against an anonymous YouTube user for using AI to mimic Jay-Z’s voice and cadence to rap Shakespeare and Billy Joel. Both are incredibly realistic.

“This content unlawfully uses an AI to impersonate our client’s voice,” said the filing. And while the videos were eventually reinstated “pending more information from the claimant”, the case – the first of its kind – rumbles on.

Roc Nation declined to comment on the legal implications of AI impersonation, as did several other major labels contacted by journalists: “As a public company, we have to exercise caution when discussing future facing topics,” said one anonymously. Even UK industry body the BPI refused to go on the record about how the industry will deal with this brave new world and what steps might be taken to protect artists and the integrity of their work. The IFPI, an international music trade body, also didn’t respond to emails.

Perhaps the reason is that, in the UK at least, there’s a worry that there is no actual basis for legal protection.

“With music there are two separate copyrights,” says Rupert Skellett, head of legal for Beggars Group, which encompasses indie labels 4AD, XL, Rough Trade and more. “One in the music notation and the lyrics – i.e. the song – and a separate one in the sound recording, which is what labels are concerned with. And if someone hasn’t used the actual recording” – if they’ve created a simulacrum using AI – “you’d have no legal action against them in terms of copyright with regards to the sound recording.”

There’d be a potential cause of action with regards to “passing off” the recording, but, says Skellett, the burden of proof is onerous, and such action would be more likely to succeed in the US, where legal protections exist against impersonating famous people for commercial purposes, and where plagiarism cases like Marvin Gaye’s estate taking on Blurred Lines have succeeded. UK law has no such provisions or precedents, so even the commercial exploitation of deepfakes, if the creator was explicit about their nature, might not be actionable. “It would depend on the facts of each case,” Skellett says.

Some, however, are excited by the creative possibilities. “If you’ve got a statistical model of millions of songs, you can ask the algorithm: what haven’t you seen?” says Yee-King. “You can find that blank space, and then create something new.” Mat Dryhurst, an artist and podcaster who has spent years researching and working with AI and associated technology, says: “The closest analogy we see is to sampling. These models allow a new dimension of that, and represent the difference between sampling a fixed recording of Bowie’s voice and having Bowie sing whatever you like – an extraordinary power and responsibility.”

Deepfakes also pose deeper questions: what makes a particular artist special? Why do we respond to certain styles or types of music, and what happens when that can be created on demand? Yee-King imagines machines able to generate the perfect piece of music for you at any time, based on settings that you select – something already being pioneered by the startup Endel – as well as pop stars using an AI listening model to predict which songs will be popular or what different demographics respond to.

“Just feeding people an optimised stream of sound,” he says, “with artists taken out of the loop completely.”

But if we lose all sense of emotional investment in what artists do – and in the human side of creation – we will lose something fundamental to music.

“These systems are trained on human expression and will augment it,” says Dryhurst. “But the missing piece of the puzzle is finding ways to compensate people, not replace them.”

Although when it comes to replacing people, and to having all of the money pie rather than just a slice of it, personally I doubt that many companies out there will bat an eyelid at putting the future Taylor Swift out of a job … Which then leaves future artists with the question: How do we compete?
