6 min read
Writing Team · Oct 13, 2025
OpenAI is fighting for its life in a Manhattan courtroom, and the weapon pointed at it isn't a novel legal theory or a sympathetic plaintiff—it's internal Slack messages. Authors and publishers suing the company have secured access to employee communications about OpenAI's deletion of a dataset containing pirated books from Library Genesis, and they're now pushing for attorney communications about that decision. If they succeed, those conversations could demonstrate willful copyright infringement, triggering damages of up to $150,000 per work. With tens of millions of copyrighted works at issue across consolidated cases, we're talking about exposure in the tens of billions—potentially more.
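The arithmetic behind that estimate is worth making explicit. Under 17 U.S.C. § 504(c), statutory damages run from $750 to $30,000 per work, rising to $150,000 per work for willful infringement. A back-of-the-envelope sketch shows how fast exposure compounds; the corpus sizes below are illustrative assumptions, not figures from the case:

```python
# Back-of-the-envelope statutory damages exposure under 17 U.S.C. § 504(c).
# The work counts are illustrative assumptions, not figures from the litigation.

STATUTORY_FLOOR = 750            # per-work minimum for non-willful infringement
STATUTORY_WILLFUL_CAP = 150_000  # per-work maximum if infringement is willful

def exposure(works: int, per_work: int) -> int:
    """Total statutory exposure for a given number of infringed works."""
    return works * per_work

for works in (1_000_000, 10_000_000, 30_000_000):  # hypothetical corpus sizes
    floor = exposure(works, STATUTORY_FLOOR)
    cap = exposure(works, STATUTORY_WILLFUL_CAP)
    print(f"{works:>12,} works: ${floor:>18,} floor, ${cap:>18,} willful cap")
```

At ten million works, even the non-willful floor is $7.5 billion, and the willful ceiling is $1.5 trillion. That gap between floor and ceiling is why willfulness, and therefore whatever those attorney communications show, is the whole ballgame.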
But the stakes extend far beyond OpenAI's balance sheet. This case is establishing the discovery framework that will govern every major AI copyright lawsuit for the next decade. The legal questions being litigated right now—what constitutes waiver of privilege, when the crime-fraud exception applies, how courts handle spoliation in AI training cases—will determine whether AI companies can keep their internal deliberations private or whether every strategic decision becomes admissible evidence. We need better precedents. Fast. Because what's happening to OpenAI is about to happen to everyone else.
According to Bloomberg Law's reporting, OpenAI initially stated it deleted the LibGen dataset due to "non-use." That explanation seemed innocuous—a routine data hygiene decision. But plaintiffs seized on it, arguing that by offering one reason for deletion, OpenAI waived attorney-client privilege over all reasons for the decision. The company later walked back the phrasing, claiming "due to" was never meant to disclose privileged rationale. Plaintiffs called it a "flip flop." The court sided with plaintiffs.
This is the legal equivalent of stepping on a landmine. The Second Circuit doesn't recognize selective waiver—if you disclose one privileged communication, you've waived privilege for everything related to that subject matter. OpenAI's attempt to explain its data deletion without revealing attorney advice may have inadvertently opened the door to discovery of every email, Slack thread, and legal memo discussing LibGen.
Rebecca Roiphe, a professor at New York Law School, put it bluntly: "The plaintiffs' point of 'no take backs' is right." OpenAI offered what Roiphe calls "the most innocuous reason" for deletion. If attorney conversations confirm that rationale, why is the company fighting so hard to keep them private? The implication is obvious: there are other, less innocuous reasons in those communications.
This pattern will repeat in every AI copyright case. Companies will be asked why they trained on specific datasets, why they removed certain sources, why they implemented particular filters. Any public explanation—however carefully worded—risks waiving privilege. And once privilege falls, plaintiffs gain access to the strategic calculus behind every training decision, including assessments of legal risk, infringement probability, and mitigation strategies.
Plaintiffs aren't stopping at waiver arguments. They're also invoking the crime-fraud exception to attorney-client privilege, claiming OpenAI's communications may reveal criminal copyright infringement or intentional destruction of evidence. If an attorney advised OpenAI to destroy the LibGen data rather than simply stop using it, that lawyer could be considered a participant in the infringement.
This is what Nathan Mammen, a partner at Snell & Wilmer, calls the "nuclear option." Courts rarely pierce privilege via crime-fraud because it would incentivize constant fishing expeditions by opposing counsel. Mammen has never seen it asserted in a copyright case—it's more common in patent litigation when invalidity is tied to fraud on the USPTO.
But these aren't normal copyright cases. Anthropic settled its author class action in August for $1.5 billion, citing "inordinate pressure" to avoid a trial where exposure for downloading millions of pirated books from the same LibGen source could have reached $1 trillion. The scale of potential damages changes the calculus. When statutory damages can hit $150,000 per work and you've trained on tens of millions of works, the crime-fraud exception stops looking like a long shot and starts looking like plaintiffs' best leverage.
David Schultz, a professor at Hamline University, notes that discovering attorney communications would be an "enormous" blow to OpenAI's defense. "Finding out what attorneys said or what clients said to attorneys and back and forth probably gives us a lot of evidence regarding state of mind." Willfulness is the multiplier that transforms manageable statutory damages into company-ending liability.
If the court grants access to attorney communications and those records show counsel advised deletion to avoid litigation exposure rather than for legitimate technical reasons, OpenAI faces two catastrophic outcomes: willful infringement findings that trigger maximum statutory damages, and spoliation sanctions for destroying evidence in anticipation of litigation.
Spoliation—the intentional destruction of evidence—is where OpenAI's legal position becomes genuinely precarious. If the court finds OpenAI deleted the LibGen dataset knowing litigation was probable, sanctions could include:

- Adverse inference instructions directing the jury to presume the destroyed data was unfavorable to OpenAI
- Monetary sanctions and cost-shifting for the discovery burden the deletion created
- In the most extreme cases, terminating sanctions such as default judgment
Mammen describes adverse inference instructions as "a very powerful stick" that puts "a heavy thumb on the scales." It doesn't guarantee the outcome, but it fundamentally shifts the burden. Juries hearing that a company destroyed evidence before litigation tend to assume the worst.
The question isn't whether OpenAI deleted data—that's undisputed. The question is when OpenAI anticipated litigation and why the company chose deletion over retention. If internal communications show legal counsel flagging copyright risk in early 2023 and recommending data deletion to limit exposure, that's spoliation. If the decision was made for legitimate technical or operational reasons before litigation was foreseeable, it's defensible.
We won't know until Judge Ona T. Wang conducts an in camera review of the contested documents—examining them privately before ruling on privilege. Legal experts predict she'll lean toward plaintiffs, given her recent order rejecting most of OpenAI's privilege claims over related materials. "If a party is acting aggressive, that just predisposes the judge," Roiphe said.
OpenAI isn't an outlier. This is the discovery playbook that will be deployed against every major AI company training on copyrighted content without explicit licenses. Google, Meta, Anthropic, Stability AI, Midjourney—all of them have made training data decisions that involve legal risk assessment, and all of them have attorney communications discussing those risks.
Anthropic already settled for $1.5 billion rather than litigate these exact issues. The company explicitly cited pressure to avoid trial exposure on willfulness claims tied to LibGen downloads. That settlement sets a floor, not a ceiling. OpenAI's consolidated cases involve more works, more plaintiffs, and potentially more egregious conduct if internal communications reveal knowledge of infringement.
The problem is systemic: AI training requires legal risk-taking, and documenting that risk creates litigation exposure. Companies need to consult counsel about copyright boundaries, fair use arguments, and dataset sourcing. But those consultations generate discoverable communications that plaintiffs will use to prove willfulness. The better your legal diligence, the more evidence you create of knowing infringement.
This Catch-22 applies across the industry. Every major AI copyright case going forward will involve fights over privilege, waiver, crime-fraud exceptions, and spoliation. And every one of them will be litigated under precedents established in In Re: OpenAI, Inc. Copyright Infringement Litigation.
The current legal architecture isn't equipped for AI at scale. Copyright law was designed for discrete acts of reproduction, not training processes that ingest millions of works to produce derived capabilities. Privilege doctrines were built for conventional litigation, not cases where statutory damages can exceed a company's market capitalization.
We need several things urgently:
1. Statutory clarity on AI training and fair use. Congress should establish whether training on copyrighted works constitutes infringement, and if so, under what conditions. The current ambiguity forces companies to operate in a legal gray zone, then punishes them for documenting their risk analysis.
2. Discovery limits for AI training cases. Courts should develop standards for when attorney communications about training data decisions are discoverable. The current case-by-case approach creates unpredictability and chills legitimate legal consultation.
3. Proportional damages frameworks. $150,000 per work made sense when infringement was manual and discrete. When a single training run touches tens of millions of works, statutory damages become existential threats that incentivize settlement over adjudication of fair use.
4. Safe harbor provisions for good-faith compliance. Companies that implement reasonable copyright compliance measures—dataset audits, DMCA procedures, opt-out mechanisms—should receive some protection from willfulness findings. The current framework treats any infringement as equally culpable regardless of intent or mitigation efforts.
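None of these compliance measures has a settled form yet, but a minimal sketch suggests what an auditable opt-out filter in a training data pipeline could look like. Everything below is a hypothetical illustration: the registry file, the Document fields, and the exclusion log are assumptions, not an established industry mechanism.

```python
# Hypothetical sketch of a dataset opt-out filter. The registry format and
# Document fields are illustrative assumptions, not an industry standard.
from dataclasses import dataclass

@dataclass
class Document:
    url: str
    domain: str
    text: str

def load_opt_out_domains(path: str) -> set[str]:
    """Load a plain-text registry of domains whose owners opted out of training."""
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

def filter_corpus(docs: list[Document], opt_out: set[str]) -> list[Document]:
    """Drop documents from opted-out domains, logging each exclusion so the
    pipeline leaves the audit trail a safe harbor would likely require."""
    kept = []
    for doc in docs:
        if doc.domain.lower() in opt_out:
            print(f"excluded: {doc.url} (domain opt-out)")
        else:
            kept.append(doc)
    return kept
```

The filtering itself is trivial; the point is the audit trail. Any safe harbor would presumably turn on whether a company can document that checks like this actually ran and were honored.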
Yvette Joy Liebesman, a professor at Saint Louis University, notes that plaintiffs' attorneys are "going to get as much information as possible to get as much money for plaintiffs as possible." That's their job. But when the legal system creates incentives for trillion-dollar settlement demands based on statutory damages designed for individual infringement cases, something is broken.
In the short term, expect more settlements. Anthropic paid $1.5 billion to avoid trial. OpenAI is facing similar or greater exposure. Meta, Google, and other major players will watch these cases closely and likely settle their own copyright litigation before discovery gets this invasive.
In the medium term, expect operational changes. AI companies will implement more aggressive data filtering, expand opt-out mechanisms, and limit internal documentation of legal risk. The chilling effect on legal consultation is real—if discussing infringement risk with counsel creates discoverable evidence of willfulness, companies will discuss it less, not more.
In the long term, expect regulatory intervention. The current litigation wave is unsustainable. Either Congress will establish clearer rules for AI training, or courts will develop common law frameworks that balance copyright protection against technological development. The alternative is a world where every major AI company faces existential copyright liability, and innovation moves offshore to jurisdictions with less aggressive enforcement.
OpenAI's privilege fight isn't about one company's legal strategy. It's about whether AI development can coexist with copyright law under the existing framework, or whether we need fundamental reform. The cases being decided now—in Manhattan federal court, with billions of dollars and the future of the industry at stake—will determine that answer.
We're watching the birth of AI copyright precedent in real time. The question is whether those precedents will be coherent, proportional, and workable, or whether they'll be the legal equivalent of duct tape on a system that was never designed for this scale of technological change.
If you're navigating AI development under uncertain copyright frameworks and need strategic guidance that balances innovation with legal risk, we're here. Let's talk about building sustainably.