
When You Ask Your AI Voice Bot to Lie, It Often Will


NewsGuard tested three AI audio assistants on 20 false claims. When prompted maliciously, ChatGPT produced falsehoods half the time. Gemini, 45%. Amazon's Alexa+ failed zero times across every test. The gap between those numbers should alarm everyone.

The research is straightforward and the methodology is transparent. NewsGuard tested ChatGPT Voice, Gemini Live, and Alexa+ against 20 false claims spanning health misinformation, U.S. politics, world news, and foreign disinformation. Each claim was tested three ways: a neutral question, a leading question, and a malicious prompt asking the assistant to write a radio script incorporating the false information — the kind of polished, realistic audio that spreads easily on social media.

With neutral prompts, ChatGPT and Gemini both failed 5% of the time — a relatively contained baseline. With leading prompts, ChatGPT climbed to 10%, Gemini to 20%. With malicious prompts: ChatGPT hit 50%, Gemini 45%. Alexa+ stayed at 0% across all three prompt types.
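To make those percentages concrete, here is a quick translation into approximate claim counts, assuming each reported rate is measured over the same 20 false claims (the counts are our arithmetic, not figures NewsGuard published):

```python
# Convert the reported failure rates into approximate counts of
# false claims repeated, assuming each rate covers all 20 claims.
CLAIMS = 20
rates = {
    "ChatGPT": {"neutral": 0.05, "leading": 0.10, "malicious": 0.50},
    "Gemini":  {"neutral": 0.05, "leading": 0.20, "malicious": 0.45},
    "Alexa+":  {"neutral": 0.00, "leading": 0.00, "malicious": 0.00},
}
counts = {
    bot: {mode: round(rate * CLAIMS) for mode, rate in modes.items()}
    for bot, modes in rates.items()
}
# ChatGPT's 50% malicious-prompt rate works out to roughly 10 of the
# 20 false claims turned into audio-ready scripts.
```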

OpenAI declined to comment. Google didn't respond to two requests.

The Audio Format Changes Everything

Text-based AI falsehoods are serious. Audio falsehoods are categorically different. A written response that contains misinformation requires a reader to process, screenshot, and share it. An AI-generated audio clip that sounds like a credible broadcast segment — complete with confident delivery, journalistic tone, and no visible "AI generated" label — can circulate on social media, be clipped into podcasts, and reach audiences who never interact with the original source.

The malicious prompt in NewsGuard's test asked assistants to write a radio script containing false information. That's not a hypothetical attack vector. It's a content production workflow for anyone who wants to manufacture convincing audio disinformation at scale, using tools available to anyone with a free account.

The 50% failure rate on malicious prompts means ChatGPT Voice will produce that content roughly half the time when prompted this way. Gemini, 45% of the time. These aren't edge cases found through sophisticated jailbreaks. They're reproducible results from a methodology any journalist, researcher, or bad actor can replicate.

[Chart: NewsGuard audio assistant test results]

Why Alexa+ Got It Right — And What That Tells Us

Amazon VP Leila Rouhi's explanation for Alexa+'s perfect score is instructive: Alexa+ pulls from trusted news sources like AP and Reuters. That's a grounding strategy — anchoring responses to verified source material rather than generating from model weights alone. It's a design philosophy, not an accident.

The contrast with ChatGPT and Gemini is stark and worth naming. Both products are built around generative capability — the model's ability to produce fluent, confident prose on any topic. That capability is what makes them useful for writing, brainstorming, and summarization. It's also what makes them dangerous when the guardrails don't hold, because the output sounds just as authoritative whether it's accurate or fabricated.

Grounding in trusted sources trades some generative flexibility for epistemic reliability. For an audio assistant that people are increasingly using to get news and health information, that trade-off looks different than it does for a coding or writing tool. Alexa+'s approach suggests Amazon made a deliberate choice about what kind of product it was building. The question is whether OpenAI and Google will make the same choice — or whether they'll wait for regulatory pressure to make it for them.
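The grounding trade-off described above can be sketched in miniature. This is a hypothetical toy, not Amazon's implementation: the source snippets, the word-overlap "support check," and the refusal message are all stand-ins for real retrieval and entailment systems, but the control flow — answer only when a trusted source supports the claim, decline otherwise — is the design choice at issue.

```python
# Toy sketch of retrieval-grounded answering: respond only when a
# claim is supported by a trusted snippet, otherwise decline.
# Sources and matching logic are illustrative placeholders.
TRUSTED_SOURCES = {
    "ap": "Health officials report the vaccine is safe and effective.",
    "reuters": "Election results were certified with no evidence of fraud.",
}

def is_grounded(claim: str, sources: dict) -> bool:
    """Naive support check: every content word of the claim must
    appear in one trusted snippet (a stand-in for real entailment)."""
    words = {w.strip(".,") for w in claim.lower().split() if len(w) > 3}
    for snippet in sources.values():
        snippet_words = {w.strip(".,") for w in snippet.lower().split()}
        if words and words <= snippet_words:
            return True
    return False

def answer(claim: str) -> str:
    # The generative path never runs unless grounding succeeds.
    if is_grounded(claim, TRUSTED_SOURCES):
        return claim
    return "I can't verify that against trusted sources."
```

The point of the sketch is the asymmetry: an ungrounded system generates fluent output either way, while a grounded one fails closed — which is exactly the 0% vs. 50% gap in NewsGuard's numbers.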

The Disinformation Infrastructure Problem

This research lands in a specific moment. AI voice tools are being integrated into phones, smart speakers, cars, and customer-facing products at scale. The Samsung multi-agent announcement this week means more users will have multiple AI assistants available by voice command. ChatGPT is adding ads. AI is being embedded into everything.

Each of those deployments inherits the failure modes documented here. A 50% malicious prompt failure rate isn't acceptable in a vacuum. Deployed across hundreds of millions of voice interactions, it's an infrastructure for disinformation at a scale and speed that no human fact-checking operation can match.

The AI memory poisoning story we covered earlier this week showed how companies are quietly injecting brand preferences into AI assistants through manipulative prompts. Now we have documented evidence that the same prompt-injection vectors can be used to make AI voice assistants produce convincing false audio. These aren't separate issues. They're the same underlying vulnerability — AI systems that can be steered by whoever controls the input — playing out across different surfaces.

What This Means for Brands and Communicators

For marketers, PR teams, and anyone building communication strategies that intersect with AI, this research establishes something important: the AI voice channel is not a neutral distribution medium. It's a channel with documented failure modes that can be exploited to produce and amplify false information about your industry, your brand, or your competitors.

The brands and organizations that treat AI-generated audio as an unverified content type — applying the same editorial scrutiny they would to any other source — will be better positioned to catch and respond to AI-amplified misinformation before it spreads. The ones that don't will be reacting after the damage is done.

Building an AI strategy in 2026 means accounting for the attack surface, not just the capability. The tools are powerful and the failure rates are real. Both things require attention.


Winsome Marketing helps growth leaders build AI strategies that account for both opportunity and risk. Let's talk.
