Copying Westlaw Headnotes to Train AI Legal Search Competitor Is Not Fair Use, Per District of Delaware
On February 11, 2025, Third Circuit Judge Stephanos Bibas, sitting by designation on the U.S. District Court for the District of Delaware, granted summary judgment that Ross Intelligence directly infringed the copyright of Thomson Reuters (owner of the Westlaw legal research platform) in 2,243 Westlaw headnotes in the course of training its own artificial intelligence (“AI”) legal research program. This case represents one of the first substantial decisions addressing the issue of copyright fair use in the AI training context. The ruling might be particularly notable for future uses of fact-like “thin” copyrighted works. The unique set of facts, however, could also cabin any significant impact from the decision.
Background
Thomson Reuters’ Westlaw legal research platform is used to access and search case law and other legal materials. Although judicial opinions are not subject to copyright, Westlaw also has editorial content and annotations like headnotes, which summarize key points of law.
Ross, now defunct, attempted to compete with Westlaw with its own AI-powered legal search engine. A user would enter a legal question into the Ross search engine, which would analyze and return relevant case law addressing the issues raised in the question. The court observed that Ross’s AI was non-generative, since it returned existing judicial opinions rather than generating new content. Ross trained its AI search technology using roughly 25,000 “Bulk Memos,” which were compilations of legal questions paired with corresponding answers drawn from case law. The Bulk Memos were created and provided by a third-party contractor, LegalEASE, which had built them from the Westlaw headnotes.
Citing remaining factual disputes as to many of the allegedly infringed headnotes, Key Number System, and other editorial decisions, the court considered only 2,830 asserted headnotes and ruled on 2,243 of them in its infringement opinion.
Ross Directly Infringed Thomson Reuters’ Copyrights in 2,243 Westlaw Headnotes
As a threshold issue, the court first held that the Westlaw headnotes are copyrightable works – both individually and as a compilation – because they satisfy the “extremely low” threshold of some “minimal degree of creativity,” despite being created from uncopyrightable judicial opinions. Analogizing headnotes to a sculpture carved from a block of marble, the court reasoned that each individual headnote, even if taken verbatim from an opinion, reflects the editor’s carefully chosen selection and opinion. Therefore, even identical copying of uncopyrightable case law text can result in copyrightable headnotes due to the editorial creativity in text selection.
The court did not address other issues related to potential copyright ineligibility, such as the typical short length of many individual headnotes and the absolute bar to copyrightability of facts and ideas.
Next, the court analyzed a subset of 2,830 asserted headnotes, for which Ross’s expert conceded that the corresponding Bulk Memo questions closely resembled the wording of the Westlaw headnotes but differed from the wording of the underlying opinions. It found that 2,243 of them infringed Thomson Reuters’ copyrights because they were so substantially similar that no reasonable jury could find otherwise.
Ross’s Fair Use and Other Infringement Defenses Fail
The bulk of the court’s analysis focused on Ross’s fair use defense, which it ultimately rejected.
Purpose and Character of the Use: The court applied the Supreme Court’s recent framework in Andy Warhol Found. for the Visual Arts, Inc. v. Goldsmith, 598 U.S. 508 (2023) to analyze this first fair use factor, by considering whether Ross’s use was commercial and for similar purposes. Ross’s use is admittedly commercial, and the court concluded that the purpose is “to make it easier to develop a competing legal research tool” to Westlaw. Therefore, Ross’s use was not transformative, giving this factor to Thomson Reuters.
In its transformativeness analysis, the court distinguished earlier cases involving intermediate copying (i.e., interim non-public reproduction) of computer programs. The court did not, however, address case law discussing transformativeness in the more closely relevant context of intermediate copying for enabling content indexing and searching (e.g., Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015)).
Nature of the Original Work: The court agreed with Ross that the headnotes had limited creativity, corresponding to limited scope of copyright protection. It emphasized, however, that this factor matters less than the others.
Amount and Substantiality of the Portion Used: The court clarified that “portion used” refers to what is “made accessible to a public.” Since Ross did not make the actual headnotes publicly available to its users, this factor favored Ross.
Likely Effect of the Copying on the Market for the Original: The court emphasized that this is the most important factor. This factor favored Thomson Reuters because Ross intended to compete with Westlaw as a market substitute. Moreover, the court recognized a potential market for Thomson Reuters in AI training (which Ross undercut) because Ross failed to show that such hypothetical market did not exist. Although acknowledging the public benefit in accessing the law, the court found that it fell short of tipping the scales.
Balanced together, the first factor and the fourth, most important factor led the court to grant summary judgment against Ross’s fair use defense.
The court concisely rejected Ross’s other defenses, including the merger doctrine. The merger doctrine holds that when there are only limited ways to express a fact or idea, the expression merges with the underlying fact or idea, thereby becoming unprotectable. The court summarily concluded that merger is inapplicable because “there are many ways to express points of law . . . .” At the same time, however, the court had referenced headnotes that copied verbatim points of law from the opinions. The court did not discuss how to reconcile these potentially conflicting findings.
Implications
On the one hand, this ruling could set a challenging bar for potential fair use defenses in other AI cases. The works copied here (individual headnotes) are, arguably, as thin and close to uncopyrightable as possible. The copying was purely intermediate in nature. The AI output contained no copyrighted material and consisted only of existing noncopyrightable works. On the other hand, the impact of this case may be limited due to its niche facts. The decision may shape the nascent copyright landscape around the use of “thin” fact-like works in AI training, particularly for non-generative AI. At the same time, however, the court left unaddressed several relevant issues that could justify a contrary finding on key analyses. Consequently, it remains to be seen whether other courts will adopt a similar analysis in the dozens of AI copyright infringement cases currently pending nationwide, or in future cases.