As businesses make greater use of generative artificial intelligence (AI) tools, opposing parties in litigation are increasingly seeking records of those interactions, such as prompts and outputs, in discovery. The U.S. District Court for the Southern District of New York addressed this issue in a September 19, 2025 ruling in In re OpenAI Inc. Copyright Infringement Litigation, brought by the New York Times against OpenAI and Microsoft. The court confirmed that ordinary discovery rules apply to AI data but declined to compel the New York Times to produce prompts and outputs because the requests were not relevant to the core copyright issues and were not proportional given the extensive review burden.
The decision follows an earlier order, issued in May, that required OpenAI to preserve output logs going forward and raised concerns about indefinite retention of every AI interaction. The September ruling clarified that preservation obligations must be targeted and defensible and do not require retaining all AI content indefinitely; subsequent proceedings removed the broad going-forward preservation requirement. The court signaled that limited metadata or logs, such as timestamps, tool or model identifiers, and request IDs, along with administrative or audit information, may be discoverable when directly related to the claims. By contrast, system-wide logs or training data are generally outside the scope of discovery unless they directly bear on the dispute and are within the party’s control. Contracts with AI vendors can also affect access to data through ownership, confidentiality, or notice terms.
Practically, organizations should identify the custodians who used AI tools, the tools involved, the kinds of data entered, and where that data resides, and then preserve selectively. Parties may, and should, object to broad requests that lack a direct connection to the claims or are disproportionate. When production is appropriate, negotiations over scope and burden typically focus on custodian, topic, and date limits; the use of metadata or sampling; and quantified review costs and privilege implications. Courts look for reasonableness and documentation in both preservation and production, and parties should limit production to what truly matters while defending against overreach.
