The Messy Middle of AI Materials Discovery
Writing Team · Oct 3, 2025 · 3 min read
In late 2023, Google DeepMind announced it had discovered 2.2 million new crystalline materials using AI. Microsoft and Meta followed with their own grand proclamations. Then the actual materials scientists showed up to the party, and things got uncomfortable.
The backlash was swift and technical. Anthony Cheetham at UC Santa Barbara found that over 18,000 of DeepMind's predicted compounds included elements like promethium and protactinium—radioactive materials so scarce they're functionally useless for real-world applications. Meta's carbon-capture candidates drew similar fire: computational chemist Berend Smit at EPFL suggested the team was "a little bit blinded to the reality" of what their AI had actually produced.
We're now in the messy middle of AI materials discovery—past the breathless announcements, before the verified breakthroughs. It's the phase where we figure out whether these systems are genuinely useful or just computationally expensive fantasy generators.
The pitch for AI in materials discovery is compelling. Traditional density functional theory (DFT) calculations—the quantum-mechanical workhorse for predicting how electrons behave in a material—are computationally expensive. Academic labs can afford to run DFT on a handful of compounds. Surveying millions would bankrupt most research budgets.
AI promises to sidestep that limitation. DeepMind's GNoME system learned from existing DFT calculations logged by projects like Lawrence Berkeley National Laboratory's Materials Project database, which contains about 200,000 crystal structures. Instead of running intensive calculations from scratch, GNoME learned to predict crystal stability much faster, then verified its most promising candidates with actual DFT.
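GNoME itself is an ensemble of graph neural networks trained on crystal structures; the sketch below is only a minimal illustration of the surrogate-then-verify pattern it exemplifies, with invented descriptor features and a generic scikit-learn regressor standing in for the real model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Stand-in for a database of prior DFT results: descriptor vectors
# (hypothetical composition features) paired with formation energies.
X_known = rng.normal(size=(2_000, 16))
y_known = 0.5 * X_known[:, 0] - 0.3 * X_known[:, 1] + rng.normal(scale=0.1, size=2_000)

# Train a cheap surrogate on the expensive DFT data we already have.
surrogate = RandomForestRegressor(n_estimators=200, random_state=0)
surrogate.fit(X_known, y_known)

# Screen a large pool of hypothetical candidates in seconds
# (scaled down here; GNoME screened millions).
X_candidates = rng.normal(size=(100_000, 16))
predicted_energy = surrogate.predict(X_candidates)

# Only the most promising few go back to full DFT for verification.
top_k = np.argsort(predicted_energy)[:100]
print(f"{len(top_k)} candidates flagged for DFT verification")
```

The economics of the whole approach live in that last step: the surrogate is only useful if its cheap ranking reliably concentrates the expensive DFT budget on genuinely stable candidates.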
Microsoft's MatterGen went further, claiming it could generate materials with specific target properties rather than blindly producing millions of candidates. Meta's team focused even more narrowly: using AI to identify metal-organic frameworks that might capture CO2 directly from air.
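To see why conditioning matters, compare it with the blind baseline it claims to replace. This toy loop (invented numbers, with a scalar band gap standing in for a full property profile) generates candidates unconditionally and filters for a target; the waste is the point.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an unconditional generator: emits a candidate whose
# band gap (eV) is a random draw. Real generators emit crystals.
def generate_blind():
    return rng.normal(loc=1.5, scale=1.0)

# Blind generate-then-filter: keep candidates near a 3.0 eV target.
target, tol = 3.0, 0.1
hits, tried = 0, 0
while hits < 10:
    tried += 1
    if abs(generate_blind() - target) < tol:
        hits += 1

# Roughly 40 draws per hit even in this easy toy; a property-conditioned
# generator aims to make nearly every draw land near the target instead.
print(f"{tried} blind samples for {hits} on-target candidates")
```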
The approach is theoretically sound. The execution is where things get complicated.
Here's what the critics caught: many of these AI-generated materials are physically implausible, structurally naive, or just rebranded versions of compounds we already know about.
The A-Lab project—a robotic system that learned to synthesize materials from published papers—claimed to have produced 41 new inorganic compounds. Robert Palgrave at University College London cried foul. His detailed critique, co-authored with Princeton's Leslie Schoop, concluded that A-Lab had mischaracterized its products and, in some cases, synthesized known materials while claiming novelty.
The deeper issue involves disorder. DFT typically predicts highly ordered crystal structures—perfect atomic arrangements that might only be stable at absolute zero. Real materials are messier. Johannes Margraf at the University of Bayreuth trained a machine-learning system on experimentally measured structures rather than DFT predictions. His model suggested that 80-84% of the approximately 380,000 "stable compounds" DeepMind highlighted as synthesis targets would actually be disordered in real life.
That's not a rounding error. It's a fundamental mismatch between what the AI predicts and what chemistry actually produces. Disordered materials can have wildly different properties than their ordered counterparts, which means the AI might be optimizing for materials that can't exist as designed.
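It helps to pin down what "stable" means in these announcements: zero (or near-zero) energy above the convex hull of competing phases, computed for a perfectly ordered crystal at absolute zero. The toy below, for a hypothetical binary A-B system with invented energies, shows the calculation in miniature.

```python
import numpy as np

# Known phases of a hypothetical A-B system:
# (fraction of B, formation energy in eV/atom). Numbers are invented.
phases = np.array([
    [0.00,  0.00],   # pure A (elemental reference)
    [0.50, -0.55],   # a known stable compound
    [1.00,  0.00],   # pure B (elemental reference)
])

def hull_energy(x):
    """Lower convex envelope of the known phases at composition x:
    the lowest energy reachable by mixing two bracketing phases."""
    best = np.inf
    for xi, ei in phases:
        for xj, ej in phases:
            if xi <= x <= xj and xj > xi:
                t = (x - xi) / (xj - xi)
                best = min(best, (1 - t) * ei + t * ej)
    return best

def energy_above_hull(x, e):
    """How far a candidate sits above the hull; ~0 means 'stable'."""
    return e - hull_energy(x)

# An AI-predicted candidate at 25% B with energy -0.20 eV/atom:
print(round(energy_above_hull(0.25, -0.20), 3))  # 0.075 -> not stable
```

A candidate sitting 75 meV/atom above the hull is metastable at best even in this idealized picture, and real temperature, synthesis conditions, and disorder shift the picture further. That is exactly the gap Margraf's experiment-trained model exposes.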
Materials scientist Ekin Dogus Cubuk, who led the GNoME work before leaving DeepMind to found Periodic Labs, acknowledges that many predicted structures will likely be disordered. He frames GNoME as a signpost toward promising compounds requiring further investigation, not a materials catalog.
That's a more defensible position than DeepMind's original claim of "an order-of-magnitude expansion in stable materials known to humanity"—a phrase that raised hackles across the materials science community. "It's pretty implausible to say that 2.2 million things you haven't synthesized are new materials," says Jonathan Godwin, formerly of DeepMind and now founder of AI-materials firm Orbital Materials.
DeepMind points to over 700 GNoME-predicted compounds that other researchers independently synthesized, plus several previously unknown caesium-based compounds guided by GNoME structures. That's legitimately useful. It's also several orders of magnitude less impressive than 2.2 million new materials.
If you're watching the materials science drama thinking it's irrelevant to your work, consider this: every industry adopting AI is making the same fundamental error. We're confusing computational output with verified utility.
AI can generate millions of materials predictions, ad variants, content strategies, or customer segments. The question isn't whether it can generate them—it's whether they work when they collide with reality. Materials scientists are learning this lesson through peer review and replication. Marketing is learning it through failed campaigns and abandoned pilots.
That's not because the AI isn't sophisticated—it's because sophistication without domain expertise produces plausible nonsense at scale. DeepMind's promethium-laden compounds and marketing's hallucinated customer insights share a common ancestor: training data divorced from practical constraints.
Materials scientist Kristin Persson at Berkeley, who directs the Materials Project, remains bullish: "I'm completely convinced that if you're not using these kinds of methods within the next couple of years, you'll be behind." She's probably right. But "using these methods" requires understanding their limitations, not just their capabilities.
We're in the awkward phase where AI materials discovery is neither fraud nor revolution—it's a tool that works better in some contexts than others, with limitations we're still mapping. The hype merchants oversold. The skeptics found real problems. The truth lives somewhere in the middle, probably involving a lot more experimental validation than anyone wants to fund.
For marketers watching this unfold, the lesson is clear: AI generates possibilities. Humans verify utility. The gap between those two activities is where most AI initiatives currently die, whether you're synthesizing crystals or optimizing conversion funnels.
Need help separating AI capability from AI hype in your marketing stack? Winsome Marketing's growth experts work with marketing leaders to identify where AI actually delivers value versus where it's just generating expensive possibilities. Let's talk about validation before scaling.