WHY THIS MATTERS IN BRIEF
Content creation is increasingly being automated with advanced AI, and this is another example of how creators will have to compete with machines in the future.
Want a real podcast? Mine’s online! Now, anyway … as our ability to use Artificial Intelligence (AI) to manipulate and create content keeps improving, including AI-generated podcasts that never end, which I wrote about a long while ago, a voice synthesis company based in Dubai, UAE, has published a fictional podcast interview between Joe Rogan and Steve Jobs using realistic voices digitally cloned from both men. It takes place during the “first episode” of a purported podcast series called “Podcast.ai,” created by Play.ht, which sells voice synthesis services.
In the interview, you first hear a replication of Rogan’s voice created by voice cloning technology similar to that which I’ve covered many times before. Deep learning technology has allowed AI models to replicate distinctive voices with a high degree of accuracy, such as in the case of Darth Vader in Disney’s Obi-Wan Kenobi TV series.
The Future of Synthetic Content and Deepfakes, by futurist Matthew Griffin
To achieve the effect, someone must first train the AI model on existing samples of the voice to be cloned, in the same way that Facebook and others did a while ago when they cloned the voices of people such as Bill Gates. Rogan is a prime target for AI voice training by deep learning models because ample quantities of his isolated voice exist on his podcasts. In fact, The Verge covered a PR stunt by an AI company called Dessa that synthesized Rogan’s voice in 2019.
Where this instance of AI tomfoolery becomes more interesting is that Play.ht additionally roped in the voice of deceased Apple CEO Steve Jobs. His voice, while robotically choppy at times, recalls his Apple keynotes and All Things Digital interviews from the late 2000s. And Play.ht claims that the text of the interview was generated by AI as well, possibly from a large language model (LLM) similar to GPT-3.
“Transcripts are generated with fine-tuned language models,” writes Play.ht on the Podcast.ai website. “For example, the Steve Jobs episode was trained on his biography and all recordings of him we could find online so the AI could accurately bring him back to life.”
In keeping with its LLM roots, the 19-minute interview doesn’t make much sense. After a while, parts of the fictional interview begin to sound like conceptual mashups of common Jobs talking points, including aesthetics, revolutionary products, competitors such as Google, Microsoft, and Adobe, and the triumphs of the original Macintosh.
For example, during a section of the interview, fake Jobs delves into criticism of Microsoft that is very similar to what the real Jobs said in a famous 1995 interview for Triumph of the Nerds, but it’s not a carbon-copy – and you can tell the voice is synthesized if you compare the two.
“That’s the problem I’ve always had with Microsoft,” fake Jobs says. “In many ways they’re smart people and they’ve done good work, but they’ve never had any taste. They’ve never had any aesthetic sense.”
Whether it’s legal to use Jobs’ or Rogan’s vocal likenesses in this manner – particularly to promote a commercial product – remains to be seen. And despite the PR-stunt nature of the podcast, the concept of entirely fictional celebrity podcasts got our attention. As voice synthesis becomes more widespread and potentially undetectable, we’re looking at a future where media artifacts from any era will likely be completely fluid and malleable, shapable to fit any narrative. In this particular fictional world, Jobs is a huge Rogan fan.
“It’s nice to sit back in the car and listen to you rant,” he says.