What Happened
In January 2026, Judge Sidney Stein affirmed a magistrate judge's ruling requiring OpenAI to hand over 20 million ChatGPT conversation logs to plaintiffs in the consolidated copyright lawsuit In re: OpenAI, Inc. Copyright Infringement Litigation.
The plaintiffs — including The New York Times, the Chicago Tribune, and numerous authors — allege that OpenAI used their copyrighted works without permission to train ChatGPT. This is one of the most significant AI copyright cases in history, with potential multi-billion-dollar implications.
Why This Matters
The Privacy Argument That Failed
OpenAI argued that producing millions of user conversations would invade ChatGPT users' privacy. The court disagreed, finding that three safeguards adequately protect user interests:
- Reduced sample size: 20 million logs instead of tens of billions
- De-identification: OpenAI must remove personally identifiable information
- Protective order: Discovery materials are governed by existing confidentiality rules
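De-identification at the scale of 20 million logs is necessarily automated. As a rough illustration only (this is not OpenAI's actual pipeline, and real de-identification combines pattern matching with named-entity recognition and human review), a minimal regex-based scrubber might look like:

```python
import re

# Hypothetical pattern list -- a real pipeline would cover far more PII types.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),   # email addresses
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),       # US Social Security numbers
    (re.compile(r"\b(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}\b"), "[PHONE]"),
]

def scrub(text: str) -> str:
    """Replace common PII patterns with generic placeholders."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

print(scrub("Email jane@example.com or call 555-123-4567."))
```

Even a scrubber like this leaves residual risk: free-text conversations can identify a person through context alone, which is why the court paired de-identification with a protective order rather than relying on it in isolation.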
Crucially, Judge Stein distinguished ChatGPT users from wiretap subjects. ChatGPT users "voluntarily submitted their communications" to OpenAI — a distinction that undermined the privacy objection.
The Fair Use Question
At the heart of this lawsuit is a fundamental question: Can AI companies train their models on copyrighted works without permission under fair use?
The court found that even logs without plaintiffs' specific works are relevant because they bear on OpenAI's fair use defense. Fair use analysis examines how the challenged use affects the market for original works. Logs showing what ChatGPT produces across a broad range of queries could reveal whether ChatGPT's outputs compete with or substitute for copyrighted content.
"Even output logs without reproductions of plaintiffs' works are discoverable because they bear on OpenAI's fair use defense."
— Court ruling summary
What Happens Next
OpenAI must now produce 20 million de-identified ChatGPT logs to both news plaintiffs and class plaintiffs. Experts will analyze these logs for evidence of:
- Market harm: Does ChatGPT compete with or substitute for copyrighted content?
- Output patterns: How often does ChatGPT generate content similar to copyrighted works?
- Fair use factors: Is ChatGPT's use transformative or merely derivative?
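Screening millions of logs for substitution evidence is an automated, statistical exercise. As a simplified sketch (not the experts' actual methodology), a word n-gram overlap score can flag outputs that closely track a source passage:

```python
def ngrams(text: str, n: int = 5) -> set:
    """Return the set of word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_score(output: str, source: str, n: int = 5) -> float:
    """Fraction of the output's n-grams that also appear in the source.

    A score near 1.0 suggests near-verbatim reproduction; a score near
    0.0 suggests little word-for-word overlap (though paraphrase can
    still substitute for a source without sharing n-grams).
    """
    out_grams = ngrams(output, n)
    if not out_grams:
        return 0.0
    return len(out_grams & ngrams(source, n)) / len(out_grams)
```

In practice, experts would layer semantic-similarity and market-data analyses on top of surface measures like this, since fair use turns on market effect, not just textual copying.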
This discovery could prove pivotal. If plaintiffs demonstrate that ChatGPT routinely generates outputs that compete with copyrighted content, OpenAI's fair use defense becomes much harder to sustain.
Implications for the AI Industry
🔑 Key Takeaways for AI Companies
- Privacy arguments won't block discovery: Courts will weigh privacy interests against relevance and expect safeguards, not wholesale withholding
- AI logs are discoverable ESI: Conversation logs are electronically stored information subject to legal holds and production
- Voluntary submission limits protection: Users who voluntarily share information with AI systems have less privacy protection than subjects of covert surveillance
- De-identification is expected: Companies must implement sensible safeguards, not refuse discovery entirely
The Broader Context
This ruling comes amid a wave of AI legal challenges:
- Wrongful death lawsuit: A California case alleges ChatGPT's GPT-4o model "intensified a man's delusions," leading to a murder-suicide
- Grok deepfake crisis: California's attorney general issued a cease-and-desist to xAI over AI-generated deepfakes
- Celebrity IP protection: Actors like Matthew McConaughey are trademarking their likenesses to prevent AI misuse
- AI toy legislation: California is considering a 4-year pause on AI-enabled toys pending safety standards
The multidistrict litigation against OpenAI remains one of the highest-stakes tests of how copyright law applies to generative AI. With 20 million data points now headed to the plaintiffs, the evidence base for answering that question has expanded dramatically.
What Users Should Know
Your ChatGPT Conversations
This ruling underscores that:
- Your conversations with ChatGPT are logged by OpenAI
- These logs can be produced, in de-identified form, in legal proceedings
- The "voluntary submission" distinction means courts may afford your conversations less privacy protection than covertly intercepted communications
- You should assume AI conversations may be reviewed by third parties
Practical Recommendations
- Don't share sensitive personal, financial, or business information with AI chatbots
- Review AI providers' privacy policies and data retention practices
- Consider enterprise AI solutions with stronger data protection if handling confidential information
- Be aware that conversations may persist even after you delete them from your view
Stay Updated on AI Legal Developments
The intersection of AI and law is evolving rapidly. We're tracking the key cases, regulations, and implications for businesses and users.