[Header illustration: stacks of newspapers, movie scripts, and music sheets transforming into AI chat bubbles on a glowing laptop, symbolizing journalism being replaced by generative AI.]

The Answer Engine on Trial: What the NYT and Chicago Tribune’s New Lawsuits Mean for AI in 2026

When two of the most influential newsrooms in America sue the same AI startup in the same week, it’s no longer a story about copyright — it’s a story about business models colliding.

In early December, The New York Times and the Chicago Tribune each filed a lawsuit in the Southern District of New York accusing Perplexity AI of systematic copying, paywall circumvention, and publication of “verbatim or near-verbatim” reproductions of their journalism inside Perplexity’s sleek answer engine.

These filings come as Perplexity is already defending the Dow Jones case involving the Wall Street Journal and New York Post, and as the Times continues its high-profile lawsuit against OpenAI and Microsoft over model training and output reproduction. Together with the Britannica and Merriam-Webster suit and the wave of publisher complaints filed throughout 2024–2025, these cases form a coordinated legal front aimed not simply at what AI models learn, but at how AI answer engines behave in the wild.

This moment has been building. In my earlier article on RAG liability and answer-engine architecture, I argued that the next stage of AI litigation would target the entire pipeline — crawling, indexing, retrieval, reconstruction, and presentation — not just the training phase.

The NYT and Tribune lawsuits make that future concrete.

These complaints allege a product that does not merely “summarize” the news but substitutes for it — using unlicensed journalism to power a system marketed around “skipping the links,” undermining the economic foundation of the very reporting it depends on.

With that framing, let’s break down what The New York Times and the Chicago Tribune actually allege — and why these cases pose an existential threat to Perplexity’s model architecture.


1. What the Lawsuits Allege — and Why These Cases Are Structurally Different

A. This is not a “training data” lawsuit — it’s an output lawsuit

Both the Times and Tribune complaints contain the familiar mix of copyright, DMCA, trademark, and unfair-competition claims. But the heart of these cases is not about what Perplexity trained on years ago — it’s about what the system outputs today.

Both complaints allege that Perplexity:

  • Copied, stored, and indexed millions of subscriber-only or paywalled works.
  • Then generated long, near-verbatim reproductions of those works in response to user prompts.
  • Presented these answers in formats that resemble authoritative reporting while obscuring sourcing.
  • Encouraged users to remain inside Perplexity’s interface instead of clicking through to publisher websites.

Courts are drawing an increasingly sharp distinction between learning from copyrighted works and replicating them. The NYT and Tribune complaints are designed to land squarely in the “replication” category.

This is why these cases are more dangerous for Perplexity than the earlier book-scanning suits against Meta and Anthropic. News content is time-sensitive commercial content — an area where courts are particularly protective because the market for the works is so direct and so easily substituted.

When an AI product reproduces or reconstructs a news article, the harm is immediate and quantifiable: a lost click, a lost subscription, a lost ad impression.

Fair use collapses quickly under that kind of market substitution.


B. Crawling, paywalls, and robots.txt: the acquisition story is as important as the outputs

A major part of both complaints is how Perplexity acquired the original works.

Across public reporting and the pleadings, the core allegations include:

  • Scraping paywalled articles from both the Times and Tribune without permission.
  • Ignoring robots.txt and other publisher opt-out mechanisms.
  • Using crawlers that masked their identity to avoid being blocked.
  • Potential reliance on third-party scraping partners known for aggressive data-capture methods.

If proven, this moves the case beyond fair use into more concrete and dangerous terrain:

  • Possible CFAA (Computer Fraud and Abuse Act) exposure.
  • Breach of contract via terms-of-service violations.
  • Evidence of willfulness, which dramatically increases damages.

Courts are far less sympathetic when a defendant appears to have deliberately maneuvered around technical and contractual barriers to acquire data it was not entitled to access.

The plaintiffs clearly understand that dynamic. These allegations are drafted to frame Perplexity not as an innocent innovator, but as a company that knew—or should have known—that its data-collection practices were unauthorized.


C. “Hallucinated attribution”: where copyright meets brand harm

One of the most striking elements in the Times complaint is the allegation that Perplexity:

  • Produced false or fabricated material,
  • Then labeled it—visually or contextually—as Times journalism,
  • Creating user confusion and reputational harm.

This is the evolution of the legal theories that Britannica and Merriam-Webster first surfaced against Perplexity: that the system does not merely misstate facts — it misstates sources.

My earlier analysis of the Britannica and Reddit cases against Perplexity covers those theories in more detail.

That makes this:

  • A trademark case
  • A false designation of origin case
  • A dilution case
  • And in some fact patterns, arguably a defamation risk

Plaintiffs no longer need to prove that Perplexity copied a specific sentence. They can instead argue: “Your system told the world we published something we never wrote, under our name.”

That is powerful and intuitive to a jury.


2. This Is a Model-Architecture Case — Not Just a Copyright Case

Perplexity has pitched itself as more than a chatbot. It calls itself an answer engine — a curated, RAG-powered system designed to deliver direct, sourced answers rather than a blank-slate LLM hallucination. But that business model creates legal risks the company did not seem prepared to manage.

The complaints target the entire architecture:

  • Crawling (unauthorized acquisition)
  • Indexing (server-side copies of protected works)
  • Retrieval (selecting long passages as “context”)
  • Generation (reconstructing those passages)
  • Presentation (UI that resembles a news summary, not a search result)

This is not a narrow fight over what happens in a training dataset. It is a challenge to how the whole answer-engine ecosystem functions, front to back.
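
To make that front-to-back framing concrete, here is a deliberately simplified, hypothetical sketch of those five stages in Python. The class and method names are my own assumptions for illustration; nothing here reflects Perplexity’s actual code. The point is that the protected article is copied, stored, or reconstructed at every stage, not only during training.

```python
from dataclasses import dataclass

@dataclass
class Document:
    url: str
    text: str  # the full article text; once stored, this is a server-side copy

class AnswerEngine:
    """Hypothetical sketch of the five stages the complaints target."""

    def __init__(self) -> None:
        self.store: dict[str, Document] = {}  # indexing: durable copies live here

    def crawl(self, url: str, raw_text: str) -> Document:
        # Acquisition: fetching the article creates the first copy.
        return Document(url=url, text=raw_text)

    def index(self, doc: Document) -> None:
        # Indexing: persisting the text creates a second, durable copy.
        self.store[doc.url] = doc

    def retrieve(self, query: str) -> list[Document]:
        # Retrieval: long passages are pulled back out as generation "context".
        return [d for d in self.store.values() if query.lower() in d.text.lower()]

    def generate(self, query: str, context: list[Document]) -> str:
        # Generation: in the worst case, the model simply reconstructs the context.
        return " ".join(d.text for d in context)

    def present(self, answer: str, sources: list[Document]) -> str:
        # Presentation: the UI decides whether users ever see, or click, the sources.
        links = ", ".join(d.url for d in sources)
        return f"{answer}\n\nSources: {links}"
```

Every stage in that sketch handles the publisher’s expression directly, which is why the complaints attack the pipeline as a whole rather than any single component.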


A. Courts are drawing a bright line between “learning” and “replicating”

Since 2024, judges across book and newspaper cases have been converging on a theme:

  • Learning = possibly transformative
  • Replicating = not transformative
  • Market substitution = fatal to fair use

The Times and Tribune complaints are drafted to lean fully on that distinction. They argue not merely that Perplexity ingested their works, but that it substitutes for their publications, down to the sentence level.

In that sense, these cases are the clearest test yet of the proposition that:

AI answer engines cannot survive by rebuilding paywalled material when prompted.

If a judge accepts that framing, the legal and commercial consequences are immediate.


B. RAG is not a safe harbor

In the early hype cycle, some companies treated RAG as a legal workaround — a way to “stay safe” by retrieving rather than memorizing. But these cases illustrate why RAG can be more legally dangerous than training:

  • If you retrieve unlicensed content, you have already made a server-side copy.
  • If your generator reconstructs or summarizes that content at scale, you create new infringing copies.
  • If your UI downplays attribution, you trigger DMCA and trademark exposure.
  • If your marketing promises to “skip the links,” you are actively promoting substitution.

Courts evaluate what the user receives, not what the engineering diagram claims.

RAG is not a doctrine. It is a design choice. And if that design choice yields infringing outputs, RAG offers no meaningful legal protection.


C. Output monitoring is becoming a baseline legal expectation

Both complaints emphasize that the infringing outputs appeared:

  • Repeatedly
  • Consistently
  • Across prompt variations

That tracks with what Judge Alsup observed in the Anthropic book cases: if the output keeps repeating the same protected work, the problem is not stochastic hallucination — it’s systemic behavior.

This signals a shift in judicial expectations:

  • AI companies must test for protected-text reproduction
  • AI companies must monitor high-risk content domains
  • AI companies must mitigate (filter or suppress) problematic outputs

“We can’t predict everything the model will say” is no longer a persuasive argument.

If you can build a RAG pipeline, courts believe you can build a guardrail pipeline as well.
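
What might a guardrail pipeline actually check? One common, minimal approach is to compare a candidate answer against the retrieved source text with a word n-gram overlap test and refuse or rewrite anything that looks like a near-verbatim reproduction. The function names and the 0.3 threshold below are assumptions for illustration, not a standard any court or vendor has adopted.

```python
import re

def word_ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    """Lowercased word n-grams; 8-word runs rarely match a source by coincidence."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def reproduction_ratio(answer: str, source: str, n: int = 8) -> float:
    """Fraction of the answer's n-grams that also appear verbatim in the source."""
    a = word_ngrams(answer, n)
    s = word_ngrams(source, n)
    return len(a & s) / len(a) if a else 0.0

def guard_output(answer: str, sources: list[str], threshold: float = 0.3) -> str:
    # Hypothetical policy: if too much of the answer is lifted verbatim from any
    # retrieved source, refuse and send the user to the original instead.
    if any(reproduction_ratio(answer, src) > threshold for src in sources):
        return "Much of this answer would reproduce a protected article; please read the original at the source."
    return answer
```

Checks like this can also be logged per query, which is exactly the kind of record plaintiffs will ask for in discovery.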


3. Why Perplexity’s Exposure Is Higher Than in Prior AI Copyright Cases

A. The plaintiffs are sophisticated and coordinated

These are not one-off plaintiffs. The Times, Tribune Publishing, and other media groups:

  • Have long histories of digital-rights enforcement
  • Understand technical discovery
  • Know how to trace server-side copying through logs
  • Have coordinated strategies against multiple AI defendants

This is not a case where a plaintiff will be overwhelmed by engineering details.

B. The damages model is multi-layered and severe

If willfulness is established — and the crawling allegations are clearly drafted with that in mind — Perplexity could face:

  • Statutory damages for each infringed work
  • Actual damages tied to subscription and ad-revenue loss
  • Disgorgement of Perplexity profits
  • DMCA §1202 enhanced damages
  • Lanham Act remedies
  • Potential punitive damages under state law

This is the most financially exposed Perplexity has ever been.

C. The business model itself may be on trial

The most existential risk is not monetary.

It is structural.

If a court concludes that Perplexity’s answer engine fundamentally depends on:

  • Unlicensed access to news
  • Substitutive reproductions in output
  • Brand associations with trusted publishers

then the remedy could include:

  • Injunctions limiting retrieval of news content
  • Mandatory licensing
  • Architectural changes to the RAG system
  • Restrictions on UI/UX that present Perplexity as a news surrogate

This is the Napster-zone risk: not because the tech is similar, but because the court may view the product’s core functionality as incompatible with copyright law.


4. What This Means for AI Companies, GCs, and Policy Leaders

Whether you are building AI, buying AI, or integrating AI into consumer-facing products, these lawsuits set expectations for what responsible AI will require in 2026 and beyond.

1. Treat crawling as a regulated activity

You must (see the sketch after this list):

  • Honor robots.txt
  • Avoid paywall bypassing
  • Align with terms of service
  • Maintain internal records of data provenance
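
A robots.txt check, at least, costs almost nothing to implement. Here is a minimal sketch using only the Python standard library; the bot name is a made-up placeholder, and honoring robots.txt is the floor, not the ceiling, since it does nothing about paywalls or terms of service.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def may_fetch(url: str, user_agent: str = "ExampleAnswerBot") -> bool:
    """Return True only if the site's robots.txt allows this user agent to fetch the URL."""
    parts = urlparse(url)
    parser = RobotFileParser()
    parser.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    parser.read()  # fetches and parses the publisher's robots.txt
    return parser.can_fetch(user_agent, url)

# Usage: identify the crawler truthfully and skip anything the publisher disallows.
if may_fetch("https://www.example.com/news/some-article"):
    pass  # fetch with an honest User-Agent header; never masquerade as a browser
```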

2. Govern training and outputs

Output governance is now a legal requirement:

  • Detecting near-verbatim reproduction
  • Monitoring brand misuse
  • Logging source selection and retrieval paths
  • Adding filters or refusal behaviors for high-risk categories

3. Build an attribution strategy now

Silence is a liability. AI systems should offer (see the sketch after this list):

  • Automatic citations
  • “Read more at the source” plumbing
  • Transparency about what content was retrieved
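
None of that requires exotic engineering. As a hypothetical sketch of the minimum, carry the retrieved sources through to the final response and render them as visible citations with a pointer back to the publisher, rather than dropping them before presentation. The names here are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class RetrievedSource:
    title: str
    url: str

def render_with_citations(answer: str, sources: list[RetrievedSource]) -> str:
    """Append numbered citations and a read-more pointer to every generated answer."""
    if not sources:
        return answer
    citations = "\n".join(f"[{i}] {s.title} ({s.url})" for i, s in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{citations}\nRead the full reporting at the links above."
```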

4. Treat memorization as a discovery risk

Plaintiffs will not ask “What usually happens?”
They will ask: “What is the worst thing your system ever did, and what did you do about it?”

5. Prepare for a new wave of publisher licensing models

These lawsuits are likely to accelerate something that has been simmering for a while: formal licensing arrangements between AI platforms and news organizations. What we are seeing now is the beginning of a market correction. Publishers want compensation, AI companies want high-quality data, and courts are starting to signal that unlicensed extraction at scale is not a sustainable business model.

For companies, this means the next wave of AI tools may look very different from the ones you use today.
There may be:

  • new licensing tiers,
  • negotiated data-access agreements,
  • reliability and provenance guarantees,
  • and potential restrictions on features that were previously free or frictionless.

The lawsuits against Perplexity mark a shift in the entire conversation about AI and content ownership. This is no longer only a debate about what goes into model training. It is a debate about how AI systems deliver information, who owns the underlying material, and who should be compensated when that material creates value downstream.

For anyone relying on AI in their workflow, especially in content-heavy or research-driven industries, this is a moment to stay alert. The next generation of AI platforms will be shaped by these legal precedents. Make sure the tools in your stack are built by companies that can license content, respect publisher rights, and evolve as the rules get rewritten.


Final Thought

The NYT and Chicago Tribune lawsuits against Perplexity mark a turning point in AI litigation. They confirm what many of us in AI policy have seen coming: courts will no longer accept vague, good-faith appeals to “innovation” when the product behavior itself looks like systematic, scalable infringement.

This is not just about Perplexity.
This is about the future of answer engines.

The companies that survive this next phase will be those that build AI systems aligned with legal, commercial, and ethical realities — with governance built into the architecture, not layered on after the fact.



Lili Kazemi is General Counsel and AI Policy Leader at Anant Corporation, where she advises on the intersection of global law, tax, and emerging technology. She brings over 20 years of combined experience from leading roles in Big Law and Big Four firms, with a deep background in international tax, regulatory strategy, and cross-border legal frameworks. Lili is also the founder of DAOFitLife, a wellness and performance platform for high-achieving professionals navigating demanding careers.

Follow Lili on LinkedIn and X

🔍 Discover What We’re All About

At Anant, we help forward-thinking teams unlock the power of AI—safely, strategically, and at scale.
From legal to finance, our experts guide you in building workflows that act, automate, and aggregate—without losing the human edge.
Let’s turn emerging tech into your next competitive advantage.

Follow us on LinkedIn

👇 Subscribe to our weekly newsletter, the Human Edge of AI, to get AI insights from a legal, policy, and human lens.

Subscribe on LinkedIn