The Food and Drug Administration's new artificial intelligence tools are failing at basic tasks while Commissioner Dr. Marty Makary pushes an aggressive rollout timeline that has FDA staff concerned about public safety. Internal sources reveal that the agency's AI systems struggle to upload documents, produce incorrect summaries, and aren't even connected to the internet—yet they're being deployed to help approve medical devices that keep people alive.
This isn't a beta testing problem. This is a regulatory crisis in the making.
When AI Fails, People Die
The FDA's Center for Devices and Radiological Health oversees pacemakers, insulin pumps, CT scanners, and other life-critical medical equipment. Its new AI tool, internally dubbed CDRH-GPT, is supposed to help reviewers sift through massive amounts of clinical trial data and animal studies to speed up approvals. Instead, according to two people familiar with the system, it's buggy, disconnected from internal FDA systems, and struggling with basic document handling.
Meanwhile, the agency's second AI tool, called Elsa, was rolled out to all FDA employees on Monday despite having significant functionality issues. When staff tested Elsa with questions about FDA-approved products, it provided summaries that were either incorrect or only partially accurate.
Dr. Arthur Caplan, head of the medical ethics division at NYU Langone Medical Center, captures the stakes perfectly: "I worry that they may be moving toward AI too quickly out of desperation, before it's ready to perform. It still needs human supplementation. AI is really just not intelligent enough yet to really probe the applicant or challenge or interact."
The key word here is "desperation." The FDA suffered sweeping layoffs earlier this year, eliminating much of the backend support that enables device reviewers to issue approval decisions on time. Rather than addressing the fundamental staffing and resource problems, the agency is betting on unproven AI to fill the gaps.
Here's what should terrify anyone who understands regulatory processes: 96.7% of AI/ML-enabled medical devices authorized by the FDA were cleared via the 510(k) pathway, which requires demonstrating substantial equivalence to an existing device rather than conducting rigorous new clinical trials. Now the FDA wants to use AI tools that can't even handle basic information retrieval to evaluate these submissions.
The regulatory framework wasn't built for this: the FDA's traditional paradigm of medical device regulation was never designed for adaptive artificial intelligence and machine learning technologies. Yet Commissioner Makary set a June 30 deadline for the AI rollout and claims the agency is "ahead of schedule."
That's not progress—that's recklessness. Sources familiar with CDRH-GPT say it still needs significant work and that FDA staff were already concerned about meeting the original June deadline. But the pressure to deploy is apparently outweighing the need for the technology to actually work.
The problems run deeper than just buggy software. Only 3.6% of FDA AI device authorizations reported race or ethnicity data, 99.1% provided no socioeconomic data, and 81.6% didn't report the age of study subjects. We're already approving AI medical devices with massive blind spots in demographic representation.
Now we're adding another layer of opacity: AI tools making recommendations about other AI tools, with limited transparency about how either system works. Some AI programs are called "black-box" models because their internal logic reflects patterns too convoluted for any person, including the original programmer, to understand.
The FDA staff testing these tools can't even get accurate summaries of public information, yet we're supposed to trust them with complex regulatory decisions involving proprietary clinical data and novel medical technologies? The math doesn't work.
Richard Painter, a law professor at the University of Minnesota and former government ethics lawyer, raises another crucial concern: potential conflicts of interest. There's no clear protocol to prevent FDA officials involved in the AI rollout from having financial ties to the companies that stand to benefit from it.
"We need to make sure that the people involved in these decisions do not have a financial interest in the artificial intelligence companies that would get the contracts," Painter said. "A conflict of interest can greatly compromise the integrity and the reputation of a federal agency."
This isn't theoretical. The FDA has issued recall notices for about 5% of AI/ML-enabled medical devices, and that's with human reviewers doing the heavy lifting. What happens when we're relying on AI systems that provide incorrect summaries to evaluate AI devices that may have fundamental flaws?
Perhaps most troubling is how this rush to AI reflects a broader abandonment of human expertise in favor of technological shortcuts. Commissioner Makary bragged that "the first reviewer who used this AI assistant tool actually said that the AI did in six minutes what it would normally take him two to three days to do."
But medical device approval isn't supposed to be fast—it's supposed to be thorough. Those two to three days represent careful analysis, critical thinking, and the kind of nuanced evaluation that catches problems before they reach patients. If an AI can do it in six minutes, either the human was wasting two days and 23 hours and 54 minutes, or the AI is missing something crucial.
Given that the AI tools are providing incorrect summaries of basic public information, we can guess which scenario is more likely.
Some FDA staff don't see AI as a solution to overwhelming workloads—they see it as a sign they may eventually be replaced. One source noted the FDA is "already spread thin from the RIF [layoffs] and the steady loss of individuals while in a hiring freeze and no capacity to backfill."
This is how regulatory capture happens in the AI age: not through corrupt officials taking bribes, but through desperate agencies deploying unproven technology to solve systemic problems that require human judgment and adequate resources.
Once these AI tools are embedded in FDA approval processes, they'll become difficult to remove even when they fail. The agency will develop dependencies on them, staff will be trained around them, and the cost of reverting to human-centered processes will seem prohibitive.
Meanwhile, medical device companies will optimize their submissions for AI review rather than human review, potentially gaming systems that already provide incorrect information. The feedback loops will become increasingly automated and increasingly divorced from actual patient safety considerations.
If you or someone you love will ever need a pacemaker, insulin pump, or any other FDA-regulated medical device, this should concern you deeply. The agency responsible for ensuring these devices won't kill you is deploying AI tools that can't accurately summarize public information to speed up approval processes that are already dominated by shortcuts.
This isn't innovation—it's institutional negligence disguised as modernization. And it's happening because we've accepted the premise that faster is always better, even when lives hang in the balance.
The FDA needs adequate staffing, proper resources, and human expertise to do its job. AI tools might eventually help, but only after they're proven to work reliably. Rushing unproven technology into life-or-death regulatory decisions isn't efficiency—it's a dangerous experiment on public safety.
Commissioner Makary's June 30 deadline should be scrapped until these AI tools can pass basic functionality tests. Because the alternative is approving medical devices based on AI recommendations that we already know are unreliable.
That's not a risk worth taking.
Ready to develop AI strategies that prioritize safety and effectiveness over speed? Contact Winsome Marketing's growth experts to build technology implementations that enhance human decision-making rather than replacing critical thinking in high-stakes environments.