OpenAI accused the New York Times of requesting 20 million chat records, saying it was a serious violation of user privacy

In November 2025, OpenAI published a statement on its official website describing the New York Times' discovery request in the copyright lawsuit as "crossing the line." According to OpenAI, the Times is seeking about 20 million ChatGPT user conversations to find out whether users used the model to bypass the Times' paywall and reproduce its reporting. OpenAI stressed that these chats contain highly sensitive material such as passwords, payment information, health issues, and emotional distress, and that handing them over in bulk to a third-party legal team would conflict with the platform's commitment to user privacy. The company said it will do its utmost to block the request in court.

The dispute stems from a copyright lawsuit the New York Times filed in late 2023. Its core allegation is that OpenAI and Microsoft used Times content to train models without authorization, producing some outputs that closely resemble the original articles. As the lawsuit progressed, the focus gradually shifted from whether the training data was lawfully used to "how and to what extent the evidence can be obtained." Courts have indicated that limited access to some of the conversation logs for discovery can be considered under strict confidentiality orders and de-identification measures, and the New York Times says it will not use the data to identify specific users. OpenAI counters that even with names and accounts removed, the content itself may be enough to reveal a person's identity and private details, so it has asked the court to exercise more restraint in balancing copyright claims against user data security.

Against this background, OpenAI was previously subject to a broader evidence preservation order that required it to suspend its routine deletion of relevant chat records and store them centrally. Through appeals and negotiation, the company later narrowed that obligation to a legal hold on data from a specific period, and it promised not to use the retained data for training or product improvement. Going forward, how the court draws the scope of chat-record disclosure will affect not only the outcome of this case but also set a reference point for how platform AI services trade off log retention, privacy protection, and litigation discovery.

FAQs

Q: Why is the New York Times asking OpenAI for 20 million chat logs?

A: The New York Times wants to find evidence in these ChatGPT conversations that users have used the model to reproduce or reconstruct the Times' paywalled content. Such evidence would support its claim that "the model reproduces copyrighted works in large numbers." The request is a discovery strategy in the copyright litigation.

Q: What risks does OpenAI see in this discovery request?

A: OpenAI believes that even if account information and names are removed, the chat content itself contains details about illness, work, family, finances, and similar matters that are enough to identify the people involved indirectly. Transferring this data in bulk to the opposing legal team would pose serious privacy risks, so OpenAI calls the request an "intrusion" on user privacy.

Q: What is the court's current position on chat records?

A: On the one hand, the court issued an evidence preservation order requiring OpenAI to suspend the deletion of relevant logs. On the other hand, in subsequent rulings it allowed only limited discovery within the framework of a protective order and did not simply grant the New York Times all of the data it requested.

Q: Will the ChatGPT conversations of ordinary users be saved for a long time?

A: OpenAI's public position is that, under normal circumstances, once a user deletes a conversation the content is removed from its systems within a set period and is no longer used for training. During the New York Times lawsuit, however, data from a certain period was subject to a court order and has to be kept in a legal hold system until the proceedings end. Enterprise customers and users with zero data retention agreements are generally not covered by this dispute.

Q: What are the potential implications of this case for the AI industry as a whole?

A: The outcome bears not only on whether using news content as training data qualifies as fair use, but also on how courts will weigh the evidentiary value of platform chat records in future litigation. When designing log retention policies, deletion mechanisms, and processes for handing data to outside parties, AI companies will have to anticipate similar demands, which will push the industry to rebalance "data minimization" against "legal compliance."
