Two Revolutions

What constitutes the law? You will find some text writers telling you that it is something different from what is decided by the courts of Massachusetts or England, that it is a system of reason, that it is a deduction from principles of ethics or admitted axioms or what not, which may or may not coincide with the decisions. But if we take the view of our friend the bad man we shall find that he does not care two straws for the axioms or deductions, but that he does want to know what the Massachusetts or English courts are likely to do in fact. I am much of his mind. The prophecies of what the courts will do in fact, and nothing more pretentious, are what I mean by the law.

Oliver Wendell Holmes, Jr., "The Path of the Law" (1897)

The present short book describes how machine learning works. It does so with a surprising analogy.
Oliver Wendell Holmes, Jr., one of the law's most influential figures in modern times, by turns has been embraced for the aphoristic quality of his writing and indicted on the charge that he reconciled himself too readily to the injustices of his day. It would be a mistake, however, to take Holmes to have been no more than a crafter of beaux mots, or to look no further than the judgment of some that he lacked moral compass. That would elide Holmes's role in a revolution in legal thought, and the remarkable salience of his ideas for a revolution in computer science now under way.
Holmes in the years immediately after the American Civil War engaged with leading thinkers of the nineteenth century, intellectuals who were taking a fresh look at scientific reasoning and logic and whose insights would influence a range of disciplines in the century to come. The engagement left an imprint on Holmes and, through his work as scholar and as judge, would go on to shape a new outlook on law. Holmes played a central role in what has recently been referred to as an "inductive turn" in law, 2 premised on an understanding that law in practice is not a system of syllogism or formal proof but, instead, a process of discerning patterns in experience. Under his influence, legal theory underwent a change from deduction to induction, from formalism to realism. This change has affected the law in theory and in practice, and it is oft-recounted in modern legal writing. A formalist view of legal texts (seeing the law as a formula that can be applied to the factual situations the legislator promulgated the law to address) remains indispensable to understanding law; but formalism, for better or worse, no longer suffices if one is to understand how lawyers, judges, and others involved in the law actually operate.
In computer science, a change has occurred which today is having at least as much impact but which in most quarters remains unknown or, at best, imprecisely grasped. The classic view of computing is that computers execute a series of logical steps that, applied to a given situation, lead to completion of a required task; it is the programmer's job to compose those steps as an algorithm that performs the task. The new approach, by contrast, is inductive and data-driven: computers are "trained" to make predictions. It had its roots in the 1950s and has come to preeminence since 2012. Machine learning is still built on computers that execute code as a series of logical steps, in the way they have since the start of modern computing. 3 But this is not an adequate explanation of what makes machine learning such a powerful tool, so powerful that people talk of it as the point of departure toward a genuine artificial intelligence.
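The contrast between the two approaches can be made concrete in a few lines of code. The following sketch, a toy "spam filter" we invent purely for illustration (the names and the data are our own assumptions, not anything from the computing literature), shows a rule whose threshold a programmer decides in advance, and then the same rule with its threshold learned inductively from labeled examples:

```python
# Deductive, rule-based approach: the programmer writes the logic.
def spam_by_rule(num_links: int) -> bool:
    # A human decides the threshold in advance.
    return num_links > 3

# Inductive, data-driven approach: the threshold is learned from examples.
def learn_threshold(examples):
    """Pick the link-count threshold that misclassifies the fewest examples."""
    best_t, best_errors = 0, len(examples)
    for t in range(0, 11):
        errors = sum((num_links > t) != is_spam for num_links, is_spam in examples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

# Labeled "experience": (number of links in the message, was it spam?)
training_data = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]
t = learn_threshold(training_data)

def spam_by_learning(num_links: int) -> bool:
    # The same logical test, but with a threshold induced from data.
    return num_links > t
```

Both functions execute ordinary logical steps; the difference lies in where the decisive number comes from. In the first, it is written down by a person; in the second, it is discerned from patterns in past experience.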

1.1 An Analogy and Why We're Making It

In this book, we describe, in the broadest sense, how machine learning does what it does. We argue that the new and unfamiliar terrain of machine learning mirrors with remarkable proximity Holmes's conception of the law. Just as the law is a system of "prophecy from experience," as Holmes put it, so too machine learning is an inductive process of prediction based on data. We consider the two side by side in the chapters that follow for two mutually supporting purposes: in order to convey a better understanding of machine learning; and in order to show that the concepts behind machine learning are not a sudden arrival but, instead, belong to an intellectual tradition whose antecedents stretch back across disciplines and generations. We will describe how machine learning differs from traditional algorithmic programming, and how the difference between the two is strikingly similar to the difference between the inductive, experience-based approach to law so memorably articulated by Holmes and the formalist, text-based approach that that jurist contrasted with his own. Law and computing thus inform one another in the frame of two revolutions in thought and method. We'll suggest why the likeness between these two revolutions is not happenstance. The changes we are addressing have a shared origin in the modern emergence of ideas about probability and statistics.
Those ideas should concern people today because they have practical impact. Lawyers have been concerned with the impact of the revolution in their own field since Holmes's time. It is not clear whether technologists' concern has caught up with the changes machine learning has brought about. Technologists should concern themselves with machine learning not just as a technical project but also as a revolution in how we try to make sense of the world, because, if they don't, then the people best situated to understand the technology won't be thinking as much as they might about its wider implications. Meanwhile, social, economic, and political actors need to be thinking more roundly about machine learning as well. These are the people who call upon our institutions and rules to adapt to machine learning; some of the adaptations proposed to date are not particularly well-conceived. 4 New technologies, of course, have challenged society before the machine learning age. Holmes himself was curious and enthusiastic about the technological change which over a hundred years ago was already unsettling so many of the expectations that long had lent stability to human relations. He did not seem worried about the downsides of the innovations that roused his interest. We turn to Holmes the futurist in the concluding part of this book by way of postscript. 5 Scientism, an unexamined belief that science and technology can solve any problem, is not new to the present era of tech-utopians.
Our principal concern, however, is to foster a better understanding of machine learning and to locate this revolution in its wider setting through an analogy with an antecedent revolution in law. We address machine learning because those who make decisions about this technology, whether they are concerned with its philosophical implications, its practical potential, or safeguards to mitigate its risks, need to know what they are making decisions about. We address it the way we do because knowledge of a thing grows when one sees how it connects to other things in the world around it.

1.2 What the Analogy Between a Nineteenth-Century Jurist and Machine Learning Can Tell Us

The claim with which we start, which we base on our understanding of the two fields that the rest of this book will consider, is this: Holmes's conception of the law, which has influenced legal thought for close to a century and a half, bears a similar conceptual shape and structure to that which computing has acquired with the recent advances in machine learning. One purpose in making this claim is to posit an analogy between a change in how people think about law, and a change that people need to embrace in their thinking about how computers work, if they are to understand how computers work in the present machine learning age. The parallels between these two areas as they underwent profound transformation provide the organizing idea of this book.
Despite the myriad uses for machine learning and the considerable attention it receives, few people outside the immediate specialty branches of computer science and statistics avoid basic misconceptions about what it is. Even within the specialties, few experts have perspective on the conceptual redirection computer science has taken in recent years, much less an awareness of its kinship to revolutionary changes that have shaped another socially vital field. The analogy that we develop here between law and machine learning supplies a new way of looking at the latter. In so doing, it helps explain what machine learning is. It also helps explain where machine learning comes from: the recent advances in machine learning have roots that reach deeply across modern thought. Identifying those roots is the first step toward an intellectual history of machine learning. It is also vital to understanding why machine learning is having such impact and why it is likely to have still more in the years ahead.
The impact of machine learning, realized and anticipated, identifies it as a phenomenon that requires a social response. The response is by no means limited to law and legal institutions, but arriving at a legal classification of the phenomenon is overdue. Lawyers and judges already are called upon to address machine learning with rules. 6 And, yet, legislative and regulatory authorities are at a loss for satisfactory definition. We believe that an analogy between machine learning and law will help.
But what does an analogy tell us that a direct explanation does not? One way to gain an understanding of what machine learning is would be to enumerate what it does. Here, for example, is a list of application areas supplied by a website aimed at people considering careers in data science:

• Game-playing
• Transportation (automated vehicles)
• Augmenting human physical and mental capabilities ("cyborg" technology)
• Controlling robots so they can perform dangerous jobs
• Protecting the environment
• Emulating human emotions for the purpose of providing convincing robot companions
• Improving care for the elderly
• General health care applications
• Banking and financial services
• Personalized digital media
• Security
• Logistics and distribution (supply chain management)
• Digital personal assistants
• E-commerce
• Customizing news and market reports. 7

Policy makers and politicians grappling with how to regulate and promote AI make lists like this too. The UK House of Lords, for example, having set up a Select Committee on Artificial Intelligence in 2017, published a report of the Committee which, inter alia, listed a number of specific fields which are using AI. 8 An Executive Order of the President of the United States, adopted in 2019, highlighted the application of AI across diverse aspects of the national economy. 9 The People's Republic of China Ministry of Industry and Information Technology adopted an Action Plan in 2017 for AI which identified a range of specific domains in which AI's applications are expected to grow. 10 But while lists of applications can reflect where the technology is used today, they don't indicate where it might or might not be used in the future.
Nor do such lists convey the clearer understanding of how AI works that we need if we are to address it, whether our purpose is to locate AI in the wider course of human development to which it belongs or to adjust our institutions and laws so that they are prepared for its impact, purposes which, we suggest, are intertwined. AI is a tool, and naming things the tool does is at best only a roundabout route to defining it.
Suggesting the limits of that approach, others attempting to define artificial intelligence have not resorted to enumeration. To give a high-profile example, the European Commission, in its Communication in 2018 on Artificial Intelligence for Europe, defined AI as "systems that display intelligent behavior by analyzing their environment and taking actions—with some degree of autonomy—to achieve specific goals." 11 This definition refers to AI as technology "to achieve specific goals"; it does not list what those goals might be. It is thus a definition that places weight not on applications (offering none) but instead on general characteristics of what it defines. However, defining AI as "systems that display intelligent behavior" is not adequate either; it is circular. Attempts to define machine learning and artificial intelligence tend to rely on synonyms that add little to a layperson's understanding of the computing process involved. 12 In a machine learning age, more is needed if one is both to grasp the technical concept and to intuit its form.
In this book, we do not continue the search for synonyms or compile an index of extant definitions. Nor do we undertake to study how AI might be applied to particular practical problems in law or other disciplines. Instead, we aim to develop and explore an analogy that will help people understand machine learning.
The value of analogy as a means to understand this topic is suggested when one considers how definitions of unfamiliar concepts work. Carl Hempel, one of the leading thinkers on the philosophy of science in the twentieth century, is known to American lawyers for the definition of "science" that the U.S. Supreme Court espoused in Daubert, a landmark in twentieth century American jurisprudence. 13 Hempel was concerned as well with the definition of "definition." He argued that definition "requires the establishment of diverse connections… between different aspects of the empirical world." 14 It is from the idea of diverse connections that we take inspiration. We posit an analogy between two seemingly unrelated fields and with that analogy elucidate the salient characteristics of an emerging technology that is likely to have significant effects on many fields in the years to come. 15 We aim with this short book to add to, and diversify, the connections among lawyers, computer scientists, and others as well, who should be thinking about how to think about the machine learning age which has now begun.
We will touch on the consequences of the change in shape of both law and computing, but our main concern lies elsewhere: namely, to supply the reader with an understanding of how precisely, under a shared intellectual influence, those fields changed shape and, moreover, with an understanding of what machine learning, the newer and less familiar field, is.

1.3 Applications of Machine Learning in Law, and Everywhere Else

Writers in the early years of so-called artificial intelligence, before machine learning began to realize its greater potential, were interested in how computers might affect legal practice. 16 Many of them were attempting to find ways to use AI to perform particular law-related tasks. Some noted the formalist-realist divide that had entered modern legal thinking. 17 Scholars and practitioners who considered law and AI were interested in the contours of the former because they wished to see how one might get a grip on it using the latter, like a farmer contemplating a stone that she needs to move and reckoning its irregularities, weight, position, etc. before hooking it up to straps and pulleys. Thus, to the extent they were interested in the nature of law, it was because they were interested in law as a possible object to which to apply AI, not as a source of insight into the emergence of machine learning as a distinct way that computers might be used.
Investigation into practical applications of AI, including in law, has been reinvigorated by advances that machine learning has undergone in recent years. The advances here have taken place largely since 2012. 18 In the past several years, it seems scarcely a day goes by without somebody suggesting that artificial intelligence might supplement, or even replace, people in functions that lawyers, juries, and judges have performed for centuries. 19 In regard to functions which the new technology already widely performs, it is asked what "big data" and artificial intelligence imply for privacy, discrimination, due process, and other areas of concern to the law. An expanding literature addresses the tasks for which legal institutions and the people who constitute them use AI or might come to in the future, as well as strategies that software engineers use, or might in the future, to bring AI to bear on such tasks. 20 In other words, a lot is being written today about AI and law as such. The application of AI in law, to be sure, has provoked intellectual ferment, certain practical changes, and speculation as to what further changes might come.
But the need for a well-informed perspective on machine learning is not restricted to law. We do not propose here to address, much less to solve, the technical challenges of putting AI in harness to particular problems, law-related or other. It is not our aim here to compile another list of examples of tasks that AI performs, any more than it is our purpose to list examples of the subject matter that laws regulate. Tech blogs and policy documents, like the ones we just referred to above, are full of suggestions as to the former; statute books and administrative codes contain the latter. Nor is it our purpose here to come up with programming strategies for the application of AI to tasks in particular fields; tech entrepreneurs and software engineers are doing that in law and many fields besides.
There are law-related problems, and others, that people seek to employ machine learning to solve, but cataloguing the problems does not in itself impart much understanding of what machine learning is. The concepts that we deal with here concern how the mechanisms work, not (or not primarily) what they might do (or what problems they might be involved in) when they work. Getting at these concepts is necessary if the people who ought to understand AI today, lawyers included, are actually to understand it. Reaching an understanding of how its mechanisms work will locate this new technology in wider currents of thought. It is on much the same wider currents that the change in thinking about law that we address took place. This brings us to the common ancestor of the two revolutions.

Two Revolutions with a Common Ancestor
Connections between two things in sequence do not necessarily mean that the later thing was caused by the one that came before, and a jurist who died in 1935 certainly was not the impetus behind recent advances in computer science. We aren't positing a connection between Holmes's jurisprudence and machine learning in that sense, nor is it our aim to offer an historical account of either law or computer science writ large. Our goal in this book is to explain how machine learning works by making an analogy to law, following Hempel's suggestion that connections across different domains can help people understand unfamiliar concepts. Nonetheless, it is interesting to note an historical link between law and the mathematical sciences: the development of probabilistic thinking. According to philosopher of science Ian Hacking,

[A]round 1660 a lot of people independently hit on the basic probability ideas. It took some time to draw these events together but they all happened concurrently. We can find a few unsuccessful anticipations in the sixteenth century, but only with hindsight can we recognize them at all. They are as nothing compared to the blossoming around 1660. The time, it appears, was ripe for probability. 21

It's perhaps surprising to learn about the link between probability theory and law. In fact, the originators of mathematical probability were all either professional lawyers (Fermat, Huygens, de Witt) or the sons of lawyers (Cardano and Pascal). 22 At about the time Pascal formulated his famous wager about belief in God, 23 Leibniz thought of applying numerical probabilities to legal problems; he later called his probability theory "natural jurisprudence." 24 Leibniz was a law student at the time, though he is now better known for his co-invention of the differential calculus than for his law. 25 Leibniz developed his natural jurisprudence in order to reason mathematically about the weight of evidence in legal argument, thereby systematizing ideas that began with the Glossators of Roman Law in the twelfth century. 26 Law and probability theory both deal with evidence; the academic field of statistics is the science of reasoning about evidence using probability theory. Statistical theory for calculating the weight of evidence is now well understood. 27 Leibniz, if he were alive today, might find it interesting that judges are sometimes skeptical about statistics; but even where (as in a murder case considered by the Court of Appeal of England and Wales in 2010) courts have excluded statistical theory for some purposes, they have remained open to it for others. 28 Whether or not a given court in a given case admits statistical theory into its deliberations, the historical link between lawyers and probability remains.
As we said, though, our concern here is not with history as such. Probabilistic thinking is not only an historical link between law and the mathematical sciences. It is also the motive force behind the two modern revolutions that we are addressing. Machine learning (like any successful field) has many parents, but it's clear from any number of textbooks that probability theory is among the most important. As for Holmes, his particular interests in his formative years were statistics, logic, and the distinction between deductive and inductive methods of proof in science; he later wrote that "the man of the future is the man of statistics." 29 That Holmes's milieu was one of science and wide-ranging intellectual interests is a well-known fact of biography; his father was an eminent medical doctor and researcher, and the family belonged to a lively community of thinkers in Boston and Cambridge, the academic and scientific center of America at the time.
Less appreciated until recently is how wide and deep Holmes's engagement with that community and its ideas had been. Frederic R. Kellogg, in a magisterial study published in 2018 entitled Oliver Wendell Holmes Jr. and Legal Logic, has brought to light in intricate detail the groundings Holmes acquired in science and logic before his rise to fame as a lawyer. Holmes's interlocutors included the likes of his friends Ralph Waldo Emerson, Chauncey Wright, and the James brothers, William and Henry. 30 Holmes's attendance at the Lowell Lectures on logic and scientific induction delivered by Charles Peirce in 1866 exercised a particular and lasting influence on his thought. 31 Holmes spent a great deal of time as well with the writings of John Stuart Mill, including Mill's A System of Logic, Ratiocinative and Inductive. (He met Mill in London in 1866; they dined together with the philosopher Alexander Bain. 32 ) Diaries and letters from the time record Holmes absorbed in conversation with these and other thinkers and innovators. Holmes eventually conceded his "Debauch on Philosophy" would have to subside if he ever were to become a practicing lawyer. 33

Holmes clearly was interested in statistics, and statistics can be used to evaluate evidence in court. But Holmes's famous saying, to which we will return below, that law is nothing more than "prophecies of what the courts will do," points to a different use of probability theory: it points to prediction. Traditional statistical thinking is mostly concerned with making inferences about the truth of scientific laws and models, at least in so far as scientific models can be said to be "true." For example, an expert might propose an equation for the probability that a prisoner will reoffend, or that a defendant is guilty of a murder, and the statistician can estimate the terms in the equation and quantify their confidence.
A different type of thinking was described by Leo Breiman in a rallying call for the nascent discipline of machine learning: he argued that prediction about individual cases is a more useful goal than inference about general rules, and that models should be evaluated purely on the accuracy of their predictions rather than on other scientific considerations such as parsimony or interpretability or consonance with theory. 34 For example, a machine learning programmer might build a device that predicts whether or not a prisoner will reoffend. Such a device can be evaluated on the accuracy of its predictions. True, society at large might insist that scrutiny be placed on the device to see whether its predictions come from sound considerations, whether using it comports with society's values, etc. But, in Breiman's terms, the programmer who built it should leave all that aside: the predictive accuracy of the device, in those terms, is the sole measure of its success. We will discuss the central role of prediction both in Holmes's thought and in modern machine learning in Chapters 5 and 6.
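Breiman's purely predictive criterion can be put in concrete terms with a short sketch. The data and the two candidate "reoffending predictors" below are invented for illustration (none of this comes from an actual risk-assessment system): each model is scored solely by the share of held-out cases it predicts correctly, with no credit given for parsimony or interpretability.

```python
# In Breiman's terms, a predictive model is judged by one number:
# how often its predictions on unseen cases are correct.
def accuracy(predict, held_out):
    correct = sum(predict(x) == y for x, y in held_out)
    return correct / len(held_out)

# Two hypothetical predictors; x is (age, prior_convictions),
# y records whether the prisoner in fact reoffended.
simple_model = lambda x: x[1] >= 2                 # an interpretable rule
complex_model = lambda x: x[1] >= 2 or x[0] < 25   # less parsimonious

# Held-out cases the models never saw during "training" (invented data).
held_out = [((22, 0), True), ((40, 3), True), ((35, 1), False), ((23, 4), True)]

simple_score = accuracy(simple_model, held_out)
complex_score = accuracy(complex_model, held_out)
```

On Breiman's criterion the comparison ends with those two numbers: whichever model scores higher is the better model, whatever else might be said for or against it.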
Time and again, revolutions in thought and method have coincided. Thomas Kuhn, among other examples in his The Structure of Scientific Revolutions, noted that a shift in thinking about what electricity is led scientists to change their experimental approach to exploring that natural phenomenon. 35 Later, and in a rather different setting, Peter Bernstein noted that changes in thinking about risk were involved in the emergence of the modern insurance industry. 36 David Landes considered the means by which societies measured time, how its measurement affected how societies thought about time, and how their thinking about time in turn affected their behaviors and institutions. 37 The relations that interested these and other thinkers lie in diverse fields and are of different kinds and degrees of proximity. A shift in scientific theory may well have a direct impact on the program of scientific investigation; the transmission of an idea from theory to the marketplace might be less direct; the cultural and civilizational effects of new conceptions of the universe (e.g., conceptions of time) still less. 38 Again, it is not our aim in this book to offer an historical account, nor a tour d'horizon of issues in philosophy of science or philosophy of law. Nor is it our aim to identify virtues or faults in the transformations we address. In computer science, it would be beside the point to "take sides" as between traditional algorithmic approaches to programming and machine learning. The change in technology is a matter of fact, not to be praised or criticized before its lineaments are accurately perceived. Nor, in law, is it to the present point to say whether it is good or bad that many jurists, especially since Holmes's time, have not kept faith with the formalist way of thinking about law. Battles continue to be fought over that revolution. We don't join those battles here.
What we do, instead, is propose that Holmes, in particular in his understanding of law as prediction formed from the search for patterns in experience, furnishes remarkably powerful analogies for machine learning. Our goal with the analogies is to explain the essence of how machine learning works. We believe that thinking about law in this way can help people understand machine learning as it is now, and help them think about where machine learning might go from here. People need both to grasp the state of the art and to think about its future, because machine learning gives rise to legal and ethical challenges that are difficult to recognize, still more to address, unless they do. Reading Holmes with machine learning in mind, we discern lessons about the challenges. Machine learning is a revolution in thinking, and it deserves to be understood much more widely and placed in a wider setting.

Notes

1. As to the difference between "artificial intelligence" and "machine learning," see Prologue, pp. ix-x.

2. Kellogg

6. P.L. 115-232, Section 2, Division A, Title II, §238, which defines artificial intelligence to include:

1. Any artificial system that performs tasks under varying and unpredictable circumstances without significant human oversight, or that can learn from experience and improve performance when exposed to data sets.

2. An artificial system developed in computer software, physical hardware, or other context that solves tasks requiring humanlike perception, cognition, planning, learning, communication, or physical action.

The second paragraph displays the shortcoming of other circular definitions, but the first identifies, more helpfully, the relevance in AI of data and of learning from experience. See also the UN Secretary-General's initial take on the topic: "There is no universally agreed definition of artificial intelligence. The term has been applied in contexts in which computer systems imitate thinking or behavior that people associate with human intelligence, such as learning, problem-solving and decision-making."

38. "made possible, for better or worse, a civilization attentive to the passage of time, hence to productivity and performance" Id. at 6-7.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/ by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.