Steve Goldstein: Navigating The Real Risks Of AI-Audio

Steve Goldstein’s Amplifi Media works with media companies and podcasters to develop audio content strategies. Goldstein writes frequently at the Amplifi blog. Steve can be reached directly at 203-221-1400 or sjgoldstein-at-amplifimedia-dot-com.

Yes, AI is opening doors to magnificent possibilities for podcasters and changing the world at warp speed. Productivity tops the list, with tools for rapid transcription, brand safety, content research, and script writing enhancing how podcasters create and manage content. Leveraging AI to save time and stay organized is a no-brainer, but worrisome developments are emerging with AI – specifically AI audio.

Just a few weeks ago, the estate of the legendary comedian George Carlin (boy, I miss him) reached a settlement with the creators of a podcast who used AI to impersonate Carlin for a comedy special.

The podcast hosts trained an AI algorithm on decades of Carlin’s work without the estate’s permission, infringing its copyrights and sparking legal action. The settlement required removing the shows and prohibited the use of Carlin’s voice or likeness without the estate’s approval, highlighting the copyright challenges AI poses. The case underlines the need for clear guidelines and ethical standards for using AI to mimic individuals, living or dead.

There’s more.

Voice phishing: A British CEO’s voice was cloned to authorize a fraudulent transfer of $250,000.

Call center cons: Scammers increasingly use AI-generated voice technology to impersonate individuals in real time during calls, extracting personal information or consent for bogus charges.

Robo deepfakes: Scammers have deployed robocalls imitating political figures, including Donald Trump and Barack Obama, making inflammatory or false statements to stir public unrest or influence elections.

Speech reproduction: AI is used to clone the voices of celebrities to say controversial or humorous things they never actually said.

More podcast cases: The Carlin case is not the only one in podcasting. In 2019, a tech enthusiast created an AI model that could replicate Joe Rogan’s voice and used it to produce entire podcast episodes under Rogan’s brand. These episodes included fabricated, controversial content that Rogan had never discussed or endorsed. It happened again with fake Rogan interviews featuring Donald Trump and OpenAI CEO Sam Altman.

Podcast host voices: Last year, Bill Simmons sparked chatter about how Spotify’s AI DJ, which was trained on a real person’s voice, could be used to copy podcast host voices for AI-generated live reads.

Undoubtedly, there will be more experiments, fraud, and questionable synthetic content.

Artificial is the Opposite of Authentic

In talking about AI and podcasting at our recent “View From the Top” panel at Podcast Movement, Oxford Road’s Dan Granger said, “Artificial is the opposite of authentic.” That stuck with me. As did Joe Rogan’s post on “X” after his “interviews” with Trump and Altman: “This is going to get very slippery, kids.” Indeed.

Podcasters will need to navigate these waters carefully, using AI to enhance their offerings without compromising the deeply human characteristics that define the medium’s appeal.

Does AI Audio Sound Good?

The reality is that AI misses the mark on making emotional connections. Much of the AI show audio I have heard, while occasionally remarkable, is mechanical, vanilla, bland, and synthetic-sounding. Sometimes there are obvious mistakes that call a show’s credibility into question, like a mispronounced word – or the sentence structure just sounds off. I have heard dull AI-generated scripts delivered by monotone AI voices. All of this can erode a podcast’s brand quality.

Safeguarding the Integrity of Your Audio

The word “authenticity” is overused in podcasting, but it has never been more important. The best way to protect the value of your audio content and brand is by maintaining a show’s integrity. Being genuine is an effective offensive strategy. AI doesn’t know how to laugh, react, pause, change intonation, or ask curious follow-up questions. It doesn’t generate original content; instead, it remixes existing works into new configurations. The creative input from humans remains indispensable. Creativity is what adds depth and nuance. It’s a differentiator. Emphasizing human creativity and emotional connection will be more critical than ever, as these are the elements that technology cannot authentically replicate—at least not yet.

Using AI effectively requires proactive guardrails. A few guidelines:

  • Employ rigorous fact-checking protocols before posting content generated by AI.

  • If you use AI to supplement your podcast audio, consider labeling it “AI-Generated Content.”

  • Being transparent and credible with your audience goes a long way. Last year, our client Alpha Media debuted the first AI DJ in Portland, Oregon. Alpha’s EVP of Content, Phil Becker, smartly labeled her “AI Ashley” to prevent confusion, giving listeners a clear boundary between human and artificial content.

  • We recommend being proactive with your IP. Apply for copyrights and monitor for unauthorized use of your audio content and your host’s voice.

It’s tempting to use AI shortcuts, or worse, deepfakes, but content creators playing long ball should resist breaking the bond and trust they’ve worked so hard to build with their listeners. As cliché as it might sound, don’t lose sight of what makes podcasts truly resonant and meaningful: their authenticity.

As the great investor and pundit Warren Buffett has said, “It takes 20 years to build a reputation and five minutes to ruin it. If you think about that, you’ll do things differently.”

Steve Goldstein