Berkeley Study: AI Doesn't Reduce Work, It Intensifies It
We just got a brutal reality check on how far AI still has to go.
5 min read · Writing Team · Feb 11, 2026 8:00:00 AM
Berkeley researchers just destroyed AI's central promise.
Every vendor pitch selling generative AI to enterprise clients includes the same seductive claim: these tools will reduce workload, automate tedious tasks, and free employees to focus on high-value strategic work. It's the foundational justification for billions in corporate AI spending.
New research from UC Berkeley's Haas School of Business suggests that promise is backwards. AI doesn't reduce work—it systematically intensifies it.
In a study spanning April to December 2025, researchers Aruna Ranganathan and Xingqi Maggie Ye observed how generative AI transformed work patterns at a 200-person U.S. technology company. Published in Harvard Business Review on February 9, 2026, their findings should make every leadership team reconsider their AI rollout strategy.
The company didn't mandate AI adoption. It simply offered enterprise subscriptions to commercial AI tools and let employees experiment. What happened next reveals the gap between AI's theoretical benefits and its practical effects on human work.
Employees worked faster. They tackled broader scopes of responsibility. They extended work into more hours of the day. Crucially, nobody asked them to do this—they voluntarily expanded their workload because AI made "doing more" feel accessible, possible, and often intrinsically rewarding during the experimental phase.
The researchers identified three specific mechanisms through which AI intensifies rather than reduces work, and the implications matter for anyone managing teams, budgets, or growth strategy.
Because generative AI can fill knowledge gaps on demand, workers increasingly absorbed responsibilities that previously belonged to specialists. Product managers started writing code. Researchers took on engineering tasks. Designers handled technical implementation they would have previously delegated or avoided entirely.
AI made these tasks feel newly accessible by providing immediate feedback and correction along the way. Workers described it as "just trying things" with the AI—casual experimentation that accumulated into meaningful expansion of job scope. The work that might have justified additional headcount or specialized hiring simply got absorbed into existing roles.
The knock-on effects created secondary workload increases. Engineers spent more time reviewing, correcting, and guiding AI-generated work produced by colleagues who were "vibe-coding" beyond their expertise. This oversight surfaced informally through Slack threads and desk-side consultations, adding invisible labor to engineers' plates without formal recognition or compensation.
For marketing and growth teams, the parallel is direct: AI tools for content creation, data analysis, or campaign optimization don't eliminate the need for expertise—they shift it. Someone still needs to evaluate quality, maintain strategic coherence, and catch the errors that automated systems introduce. That someone is usually your most experienced team member, whose time just became more fragmented.
The conversational interface of AI tools fundamentally changed when work happened. Because prompting an AI system feels closer to chatting than executing a formal task, workers slipped small work activities into moments that had previously been breaks.
Many employees prompted AI during lunch, in meetings, or while waiting for files to load. Some described sending a "quick last prompt" before leaving their desk so the AI could generate output while they stepped away. These micro-tasks rarely felt like "doing more work," yet over time they eliminated natural pauses and created a more continuous involvement with work.
Workers described realizing—often in hindsight—that as prompting during breaks became habitual, downtime no longer provided the same sense of recovery. Work became less bounded and more ambient, something that could always be advanced a little further without deliberate intention.
The boundary between work and non-work didn't disappear, the Berkeley researchers note, but it became dramatically easier to cross. For knowledge workers already struggling with work-life integration, AI tools effectively lowered the friction that previously protected personal time.
AI introduced a new rhythm where workers managed multiple active threads simultaneously: manually writing code while AI generated alternative versions, running multiple agents in parallel, or reviving long-deferred tasks because AI could "handle them" in the background.
Workers did this partly because they felt they had a "partner" helping them move through their workload. While this sense of partnership enabled momentum, the reality involved continual attention switching, frequent checking of AI outputs, and a growing number of open tasks creating cumulative cognitive load.
Over time, this rhythm raised expectations for speed—not through explicit demands, but through what became visible and normalized in everyday work. Many workers noted they were doing more at once and feeling more pressure than before using AI, even though automation's time savings were supposedly meant to reduce such pressure.
As one engineer summarized: "You had thought that maybe, oh, because you could be more productive with AI, then you save some time, you can work less. But then really, you don't work less. You just work the same amount or even more."
The pattern creates a positive feedback loop that leadership teams typically miss until it's too late. AI accelerates certain tasks, which raises expectations for speed. Higher speed increases reliance on AI. Increased reliance widens the scope of what workers attempt. Wider scope expands the quantity and density of work.
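The compounding nature of this loop can be sketched with a toy simulation. This is purely illustrative: the per-cycle growth rate is a hypothetical parameter, not a figure from the study.

```python
# Toy model of the AI workload feedback loop described above.
# Each cycle: faster completion raises speed expectations, which
# increases AI reliance, which widens scope, which adds work.
# The 10% per-cycle gain is a hypothetical illustration, not study data.

def simulate_workload(cycles: int, gain: float = 0.10) -> list[float]:
    """Return relative workload after each cycle, starting from a baseline of 1.0."""
    workload = 1.0
    history = []
    for _ in range(cycles):
        workload *= 1 + gain  # scope creep compounds on the prior cycle's load
        history.append(round(workload, 2))
    return history

# Even modest per-cycle scope creep roughly doubles workload within eight cycles.
print(simulate_workload(8))
```

The point of the sketch is that no single cycle feels like "doing more work," yet the compounding effect is substantial, which mirrors how the voluntary expansion the researchers observed stays invisible until it accumulates.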
Organizations might view this voluntary expansion as validation—if workers are doing this on their own initiative, isn't that the productivity explosion we've been promised?
Ranganathan and Ye's research reveals why that's dangerous thinking. What looks like higher productivity in the short run masks silent workload creep and growing cognitive strain as employees juggle multiple AI-enabled workflows. Because the extra effort is voluntary and framed as enjoyable experimentation, it's easy for leaders to overlook how much additional load workers carry.
The cumulative effect: fatigue, burnout, and a growing sense that work is harder to step away from, especially as organizational expectations for speed and responsiveness rise. Overwork impairs judgment, increases error likelihood, and makes it harder to distinguish genuine productivity gains from unsustainable intensity.
Instead of passively responding to how AI reshapes work, the Berkeley researchers propose that organizations develop what they call an "AI practice"—intentional norms and routines that structure how AI is used, when it's appropriate to stop, and how work should and should not expand in response to newfound capability.
Without such practices, they argue, the natural tendency of AI-assisted work is intensification, not contraction, with serious implications for burnout, decision quality, and long-term sustainability.
Their framework includes three specific interventions:
Intentional pauses: Brief, structured moments that regulate tempo—protected intervals to assess alignment, reconsider assumptions, or absorb information before moving forward. For example, requiring one counterargument and one explicit link to organizational goals before finalizing major decisions. These pauses prevent the quiet accumulation of overload that emerges when acceleration goes unchecked.
Sequencing: Deliberately shaping when work moves forward, not just how fast. This includes batching non-urgent notifications, holding updates until natural breakpoints, and protecting focus windows where workers are shielded from interruptions. Rather than reacting to every AI-generated output as it appears, sequencing encourages work to advance in coherent phases that preserve attention and reduce context switching.
Human grounding: Protecting time and space for listening and human connection. Short opportunities to connect with others interrupt continuous solo engagement with AI tools and help restore perspective. Beyond perspective, social exchange supports creativity—AI provides a single synthesized viewpoint, but creative insight depends on exposure to multiple human perspectives.
The Berkeley research suggests that generative AI's promise lies not only in what it can do for work, but in how thoughtfully it's integrated into daily rhythm. Without intention, AI makes it easier to do more—but harder to stop.
For marketing leaders and growth teams evaluating AI adoption, this means the productivity gains vendors promise may arrive with hidden costs that don't appear in quarterly metrics until turnover spikes or quality degrades. The question isn't whether AI will change work—it's whether you'll actively shape that change or let it quietly shape you.
Winsome Marketing's growth experts help organizations build sustainable AI practices that enhance rather than exhaust teams. Strategic AI implementation requires more than tool selection—it demands understanding how these systems reshape work patterns, decision-making, and team dynamics in ways that compound over time.
The research is clear: AI doesn't reduce work. The only question is whether you'll recognize intensification before it becomes crisis.