- Volume 76, Issue 5
- Page 893
Article
The Necessary and Proper Stewardship of Judicial Data
Zachary D. Clopton & Aziz Z. Huq *
Governments and commercial firms create profit and social gain by exploiting large pools of data. One source of valuable data, however, lies in public hands yet remains largely untapped. While the deep reservoirs of data produced by Congress and federal agencies have long been available for public use, the data produced by the federal judiciary is only loosely regulated, imperfectly available to the public at large, and largely ignored by scholars.
The ordinary process of litigation in federal courts generates an enormous volume of data. Especially after recent developments in large language models, this data holds immense potential. It can be used to predict case outcomes or clarify the law in ways that advance legality and judicial access. It can reveal shortfalls in judicial practice and enable the provision of cheaper, better access to justice. It can make legible many otherwise invisible social facts that, if brought to light, can improve public policy. Or the data can fuel private profits, its benefits accruing to a small coterie of data brokering firms capable of monopolizing its commercial use.
This Article is the first to address the complex empirical, legal, and normative questions raised by the untapped public asset of judicial data. It develops a positive, descriptive account of how federal courts produce, dissipate, preserve, or disclose information. This account includes a map of the well-known sources of Article III data (for example, opinions, orders, and briefs), but also extends to a massive volume of “dark data” produced but either lost or buried by the courts. This positive analysis further uncovers a complex administrative framework that erects a plethora of walls and hurdles—some categorical, and some individuated—to slow down or stop public access.
With this positive understanding in hand, we offer a careful analysis of the constitutional questions implicated in decisions to disclose, or to render opaque, judicial data. Drawing attention to the key question of who controls judicial data flows, we demonstrate the existence of sweeping congressional power to regulate judicial data outside of a small zone of inherent judicial authority and a handful of instances in which privacy or safety is implicated by disclosure. Congressional authority, therefore, is the rule and not the exception.
Having established these empirical and legal predicates, the Article offers a normative vision of how Congress should regulate the production and dissemination of judicial data in light of the capabilities and incentives of relevant actors. The information produced by the federal courts should not exclusively be a source of private profit for a few data-centered firms. It is a public asset that should be elicited and disseminated in ways that advance the federal courts’ mission of equal justice under law.