“Instead of millions of separate lawsuits with millions of juries, we will have a single proceeding before a single jury, Napster style.” — Judge William Alsup, Bartz v. Anthropic, Class Certification Order, July 17, 2025
In the summer of 2025, two rulings landed just 48 hours apart: Bartz v. Anthropic (Judge Alsup) and Kadrey v. Meta (Judge Chhabria). Both agreed on one point — GenAI training on copyrighted works can, in theory, be transformative. But from there, the paths diverged.
On September 25, 2025, Judge William Alsup granted preliminary approval of Anthropic’s record-setting $1.5 billion class action settlement with authors over the use of pirated books to train its Claude models, the largest publicly known copyright recovery in U.S. history. The deal covers past conduct only, requires Anthropic to destroy the infringing copies, and pays roughly $3,000 per book across 482,460 titles. While the settlement sets a powerful benchmark for valuing training data, it reflects litigation risk rather than a true arm’s-length license price, leaving open the larger questions about outputs, future markets, and what fair compensation should look like.
Before the final chapter in Anthropic, Meta secured a very different outcome in its own “battle of the books.” Judge Vince Chhabria in the same district court granted summary judgment in Meta’s favor on fair use — not because the court endorsed Meta’s conduct as lawful, but because the plaintiffs failed to make the right arguments or show market harm. The ruling was explicitly limited.
The copyright battles over AI are accelerating, but the real issues remain unsettled. At their core, these cases ask: when AI systems train on copyrighted works, who gets paid, and how much? The answers are still elusive, even as billion-dollar settlements and headline rulings stack up.
⚖️ What the Courts Actually Decided in the Anthropic and Meta Cases
Case Elements | Bartz v. Anthropic, No. 3:24-cv-05417-WHA (filed 8/19/24); class certified July 17, 2025 | Kadrey v. Meta, No. 3:23-cv-03417-VC (complaint filed 7/7/23) |
---|---|---|
Judge | William Alsup (N.D. Cal.) | Vince Chhabria (N.D. Cal.) |
Procedural Posture | Class certified July 2025; interlocutory appeal and stay denied August 2025 | Summary judgment granted to Meta June 2025 |
Key Plaintiffs | Includes authors Andrea Bartz and Charles Graeber, as well as affiliated corporate entities | Sarah Silverman, Andrew Sean Greer, Junot Díaz |
AI Model | Claude (Anthropic) | Llama (Meta) |
Training Data Sources | Books3, purchased books, pirated libraries including LibGen and PiLiMi | Books3, LibGen, other shadow libraries |
Fair Use Ruling (Inputs) | Lawfully purchased books: fair use; purpose and character “exceedingly transformative.” However, storage of pirated material, even if later used for training, is “inherently, irredeemably infringing, even if the pirated copies are immediately used for the transformative use and immediately discarded” | Inputs treated as “highly transformative,” but fair use not guaranteed without market harm analysis |
Pirated Books / Acquisition | Fair use denied for building a permanent research library from pirated books; class certified for works scraped from shadow libraries | Use of books from shadow libraries did not defeat fair use in this case, though could in others |
Output Liability | No infringing outputs alleged; training on lawfully acquired books was held to be fair use. However, the court denied fair use for Anthropic’s storage and use of pirated books, allowing a class action to proceed for works scraped from shadow libraries. | In 2023, the court dismissed as “nonsensical” the claim that Llama itself was an infringing derivative work, holding that a model cannot be understood as a recasting or adaptation of the books it was trained on. The court likewise rejected the claim that all outputs were infringing derivatives. Llama did not regurgitate more than 50 tokens from any of the plaintiffs’ works, and could do even that only 60% of the time under deliberately coaxing prompts. Without “an infringing output, there can be no vicarious infringement.” |
Market Harm Analysis | Court ruled that training on purchased books does not entitle authors to a licensing market; however, treating the use of pirated books as fair use would effectively condone stealing works that could otherwise be purchased | Plaintiffs failed to prove harm; court found market dilution argument “promising” but undeveloped. Court expressed concern that generative AI could “flood the market with endless amounts of images, songs, articles, books, and more” and warned that using copyrighted works to train AI models without permission will “likely be illegal” in many circumstances; however, issues are fact-dependent. |
Key Judicial Commentary | “If Anthropic loses big, it will be because what it did wrong was also big.” | “This decision does not stand for the proposition that Meta’s use is lawful…only [] that these plaintiffs made the wrong arguments.” |
Current Status | Court denied Anthropic’s motion to stay, rejecting its bid to pause the case pending appeal on fair use and class certification, keeping the December 1, 2025 trial date in place. UPDATE: $1.5 BILLION SETTLEMENT REACHED; PRELIMINARY APPROVAL GRANTED SEPTEMBER 25, 2025. | Summary judgment granted to Meta; the court is still considering a claim for unlawful distribution. Discovery phase ongoing. |
The Sliding Doors Moment – Would Anthropic and Meta Have Turned Out Differently If the Courts Were Switched?
Judge Alsup took a bright-line approach: if the data was lawfully sourced, fair use might apply; if it came from pirated libraries, the defense collapsed at the threshold. He also minimized market harm, likening training Claude to teaching schoolchildren to write.
Judge Chhabria, by contrast, suggested transformation could still outweigh provenance. But he firmly rejected the schoolchildren analogy as “inapt,” stressing instead that generative AI can churn out endless articles, books, songs, and images in a fraction of the time it would take humans — a scale of production that could dramatically undercut creative markets. The issue with this “potentially winning argument”? The plaintiffs didn’t make it. They, according to the court, “barely gave it lip service.”
It was a Sliding Doors moment — the plaintiffs made the wrong arguments and failed to rebut Meta’s evidence that no market harm had occurred, leaving the door open for a different outcome in a stronger case.
Both Anthropic and Meta sidestepped deeper rulings — one through a negotiated settlement, the other through a partial, caveated victory. Neither outcome closes the broader debate about how copyright law will adapt to the training of large language models (LLMs). Notwithstanding the fact-specific nature of the decisions, the bright-line analogies the judges drew — to Napster, to teaching schoolchildren, to flooding the market — will continue to shape the discourse.
And Meta’s win was hardly absolute. Judge Chhabria himself cautioned:
“No matter how transformative [generative AI] training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.”
That’s the tension at the heart of nearly fifty cases still pending. Tremendous amounts of data are needed to train the large language models that power generative AI. Musicians, book authors, visual artists, and news outlets have all sued over the use of their work without permission or payment. AI companies, for their part, argue that training is fair use — a fundamentally transformative process, and one they say is essential for innovation.
🏛️ Can Courts Weigh Markets That Don’t Exist Yet?
Both Judge Chhabria and Judge Alsup were hesitant to extend copyright protection to a market that does not yet exist. In Kadrey, the parties argued over whether a market for licensing general trade books exists or is likely to develop.
Both the Meta and Anthropic courts agreed that this market, whether or not it exists, is not one that creators are legally entitled to monopolize under the Copyright Act. YET THAT DOES NOT MEAN THERE SHOULD NOT BE ONE. If the markets, rather than the courts, could right-size the divide between creators and developers, there would not be so many lawsuits in the first place.
The Napster era also illustrates how “potential markets” can become real ones. Even if the CD market wasn’t directly harmed, courts flagged risks to the developing digital download market. Napster’s rise and fall coincided with—and arguably helped catalyze—the music industry’s eventual embrace of legitimate digital distribution. By demonstrating massive consumer appetite for downloadable music, Napster revealed both the market opportunity and the need for licensed alternatives. The legal pressure that ultimately shut down Napster created space for licensed platforms to emerge—first iTunes, then Spotify, and today’s streaming ecosystem (including a re-launched, licensed Napster).
It’s this pattern of legal disruption reshaping entire industries that makes Judge Alsup’s repeated Napster references so striking—even if the comparison doesn’t quite fit. And that’s why we need to talk about Napster.
🎧 Wake Up, Kids: Napster Still Haunts the Debate
“Wake up, kids, we got the dreamer’s disease.” — New Radicals, You Get What You Give, 1998
The same spirit that reimagined music at the turn of the millennium is now reimagining creativity itself through AI. And that’s why Napster keeps coming back into the conversation.
Judge Alsup invoked Napster three times in Bartz, calling Anthropic’s acquisition of training data “Napster-style downloading.” His warning was clear: build on unauthorized sources and risk the same fate.
But here’s the key distinction. Napster was about exact copying and peer-to-peer distribution of MP3s. LLMs don’t hand back the same books they were trained on; they synthesize and generate new outputs. That makes them both more transformative and more potentially dilutive. As Judge Chhabria noted, courts have never faced a technology that can be so creative while also capable of flooding the market with near-instant substitutes.
History shows what happens when courts and markets collide. Napster collapsed. Ross Intelligence went under. Anthropic has bought itself time through settlement, but not every company can write that check.
🌅 Final Takeaways: From Napster’s Past to AI’s Future
Settlements may pause individual cases, but they don’t resolve the larger fight. Bartz and Kadrey show that:
- Fair use decisions will remain highly fact-specific — each case turns on the sourcing, the use, and the record before the court.
- Evidence will be decisive — outputs and market impact, not speculation, will drive outcomes.
- Market harm can, depending on the court, include dilution from “flooding the market” or the threat to future licensing markets, even if no such markets exist yet or the outputs of competing works would never be mistaken for a Hemingway.
- AI copyright exposure spans the full lifecycle — from how training data is sourced, to whether models are deployed internally or made public, to the unsettled question of what counts as a transformative output.
- Once the terms of these settlements come to light, they could start to form a rough benchmark for how much AI training data is worth when it involves copyrighted works—particularly those of well-known creators. But that benchmark will always be distorted, since the very incentive to settle (especially for companies with deep pockets who can “throw money at the problem”) skews the numbers from what a fully litigated case might have shown.
“Wake up, kids.” The Napster analogy reminds us that courts, rights holders, and markets evolve together. The lawsuits of today may be paving the way for the licensing frameworks of tomorrow. With nearly fifty cases pending, and the next wave already challenging outputs, this fight is only beginning.
— Lili Kazemi, General Counsel & AI Policy Lead, Anant US
Want more human insights on AI? 📬 Subscribe to The Human Edge of AI, a weekly newsletter on AI at the intersection of law, policy, and our everyday lives.
Appendices
The Critical Technological Distinctions Between Music File Sharing and AI
Anthropic and Meta are not file-sharing services. Their models don’t distribute—they generate. That difference between transformation and duplication is central to both rulings: Judge Alsup acknowledged this distinction, noting that training Claude on lawfully acquired books was “spectacularly transformative.” The law isn’t adjudicating simple copying anymore—it’s evaluating training, retention, and what AI systems remember.
Feature | Napster (2001) | Anthropic / Meta (2025) |
---|---|---|
Type of System | Peer-to-peer sharing | Generative AI |
Copying | Exact MP3 duplication | Tokenized statistical transformation |
Output | Identical files | Synthetic responses |
Legal Focus | Distribution | Training, retention, output liability |
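To make the “tokenized statistical transformation” row concrete, here is a minimal Python sketch of the difference, using a toy bigram counter as a stand-in for billions of gradient-trained neural-network weights. This is an illustration of the general technique, not any vendor’s actual pipeline; real systems use subword tokenizers (such as BPE) and learned parameters rather than raw counts.

```python
from collections import defaultdict

def napster_style_copy(src_bytes: bytes) -> bytes:
    """Exact duplication: the output IS the input, byte for byte."""
    return bytes(src_bytes)  # a perfect copy, ready to redistribute

def tokenize(text: str) -> list[str]:
    """Toy tokenizer; real systems use subword schemes like BPE."""
    return text.lower().split()

def train_bigram_counts(corpus: list[str]) -> dict:
    """Fold each work into shared statistics (stand-in for model weights).

    After training, no individual book exists as a retrievable file;
    what remains are aggregate co-occurrence counts across all works.
    """
    counts: dict[str, defaultdict] = {}
    for text in corpus:
        tokens = tokenize(text)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts.setdefault(prev, defaultdict(int))[nxt] += 1
    return counts

corpus = [
    "the old man and the sea was calm",       # stand-ins for training works
    "the sea was angry that day my friends",
]
model = train_bigram_counts(corpus)
# The "model" holds statistics, not the works themselves:
print(model["sea"])  # counts of words following "sea": {'was': 2}
print(model["the"])  # counts of words following "the": {'old': 1, 'sea': 2}
```

The point the table is making: after training, what persists is aggregate statistics across many works rather than retrievable copies. The caveat, and the reason courts still probe regurgitation, is that at sufficient scale models can nonetheless memorize and reproduce passages, which is why output behavior rather than possession of a file becomes the litigable question.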
The Training Divide
Both Meta and Anthropic agreed that training large language models on copyrighted works is, at its core, a transformative use. Judge Alsup put it plainly: “the purpose and character of using copyrighted works to train LLMs…was quintessentially transformative.” Training a model was likened to learning from books in order to create something new, not simply to replicate.
Judge Alsup compared Claude’s training to teaching children — reading books to write their own. But the analogy breaks down:
Factor | Human Learning | AI Training |
---|---|---|
Processing Scale | ~11M bits/sec (400 consciously) | Billions of tokens, rapidly |
Dissemination Impact | One essay from one student | One Claude powers millions of users |
Retention & Recall | Humans forget | LLMs don’t (unless governed) |
Recommendations for Legal and Creative Professionals
What Legal Teams Should Do Now
1. Audit Your AI Supply Chain Know what models you’re using, where they source their data, and whether indemnity clauses are meaningful; boilerplate protections offer little cover when courts are scrutinizing data acquisition methods.
2. Review Vendor Agreements Generic disclaimers aren’t enough. Push for transparency on data provenance and training scope. Require vendors to provide detailed information about their training datasets and any potential copyright risks.
3. Establish Internal Content Policies Create workflows to flag AI-generated content that might raise IP concerns. Implement safeguards to prevent models from reproducing substantial portions of copyrighted works, even when prompted to do so (a minimal sketch of one such safeguard follows this list).
4. Prepare for the Market Harm Fight Document examples where AI-generated outputs compete with human work. Engage with licensing collectives to explore future royalty regimes. Build records showing how AI-generated content affects demand for human-created works.
5. Embed Compliance into AI Development Don’t let product or engineering teams operate in a legal vacuum. Require documentation at every stage: data acquisition, preprocessing, fine-tuning, and deployment.
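As flagged in item 3, here is a minimal sketch of an output safeguard, assuming a whitespace tokenizer, a small in-memory corpus of protected works, and a hypothetical 50-token threshold inspired by the regurgitation evidence discussed in Kadrey. A production system would use the model’s own tokenizer, an indexed corpus, and fuzzy matching instead of this brute-force comparison, and the threshold itself is a legal and risk judgment, not a technical constant.

```python
def token_ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """All contiguous n-token windows in a sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def longest_verbatim_run(output: str, protected: str) -> int:
    """Length (in tokens) of the longest span of `output` appearing
    verbatim in `protected`. Whitespace tokenization is a simplification;
    a real system would use the model's own tokenizer."""
    out_toks = output.lower().split()
    ref_toks = protected.lower().split()
    best, n = 0, 1
    # A shared (n+1)-gram implies a shared n-gram, so grow n until no match.
    while n <= len(out_toks):
        if token_ngrams(out_toks, n) & token_ngrams(ref_toks, n):
            best = n
            n += 1
        else:
            break
    return best

# Hypothetical policy threshold, echoing the 50-token figure discussed
# in Kadrey; the right number is a legal/risk call, not a constant.
MAX_VERBATIM_TOKENS = 50

def flag_output(output: str, protected_corpus: list[str]) -> bool:
    """True if the output should be routed for human IP review."""
    return any(
        longest_verbatim_run(output, work) >= MAX_VERBATIM_TOKENS
        for work in protected_corpus
    )
```

The design choice worth noting: the check runs on outputs, not training data, which aligns with where Judge Chhabria signaled the decisive fights will be.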
For Content Creators and Publishers
Prioritize Output-Focused Claims: Future litigation should target what AI systems generate, not just what they were trained on. Focus on substantial similarity, derivative work theories, and market substitution rather than training data acquisition.
Document Market Impact: Track specific instances where AI outputs compete with your content in the marketplace. This evidence will be crucial for the market dilution arguments that Judge Chhabria signaled could be decisive.
Consider Collective Action: While courts have rejected automatic licensing rights for training data, market dilution arguments may create opportunities for collective licensing regimes that compensate creators for competitive harm.

This piece is Part II of my four-part series on AI and copyright.
- Part I, The Wild West of AI, broke down the copyright rules for training data, the Meta and Anthropic rulings, and how courts are beginning to draw the first lines around what’s fair—and what’s not.
- Part II (you’re here) tackles the Napster problem: why analogies to music piracy fall short in the world of LLMs, and why AI synthesis requires a new legal lens.
- Part III will zoom in on output liability—when models mimic protected works in style, voice, or structure—and whether that’s enough to trigger copyright infringement.
- Part IV will shift from law to money: exploring how AI systems are being valued, taxed, and priced across jurisdictions, and what this means for global IP and transfer pricing regimes.
Case References
- Bartz v. Anthropic, No. 3:24-cv-05417-WHA (N.D. Cal.)
- Kadrey v. Meta Platforms, No. 23-cv-03417-VC (N.D. Cal.)
- The New York Times Co. v. Microsoft Corp. & OpenAI, No. 1:23-cv-11195 (S.D.N.Y.)
  - April 2025 Order on Motions to Dismiss (Dkt. 551)
- In re: OpenAI, Inc. Copyright Infringement Litigation, Order dated May 13, 2025 (Preservation Order)
- In re: OpenAI, Inc. Copyright Infringement Litigation, Order dated June 20, 2025 (Order Affirming Preservation Order)
- Disney Enterprises, Inc. et al. v. Midjourney, Inc., No. 2:25-cv-05275 (C.D. Cal.)
Lili Kazemi is General Counsel and AI Policy Leader at Anant Corporation, where she advises on the intersection of global law, tax, and emerging technology. She brings over 20 years of combined experience from leading roles in Big Law and Big Four firms, with a deep background in international tax, regulatory strategy, and cross-border legal frameworks. Lili is also the founder of DAOFitLife, a wellness and performance platform for high-achieving professionals navigating demanding careers.