# Bending the law: geometric tools for quantifying influence in the multinetwork of legal opinions

## Abstract

Legal reasoning requires identification through search of authoritative legal texts (such as statutes, constitutions, or prior judicial opinions) that apply to a given legal question. In this paper, using a network representation of US Supreme Court opinions that integrates citation connectivity and topical similarity, we model the activity of law search as an organizing principle in the evolution of the corpus of legal texts. The network model and (parametrized) probabilistic search behavior generate a Pagerank-style ranking of the texts that in turn gives rise to a natural geometry of the opinion corpus. This enables us to measure the ways in which new judicial opinions affect the topography of the network and its future evolution. While we deploy it here on the US Supreme Court opinion corpus, there are obvious extensions to large evolving bodies of legal text (or text corpora in general). The model is a proxy for the way in which new opinions influence the search behavior of litigants and judges and thus affect the law. This type of “legal search effect” is a new legal consequence of research practice that has not been previously identified in jurisprudential thought and has never before been subject to empirical analysis. We quantitatively estimate the extent of this effect and find significant relationships between search-related network structures and propensity of future citation. This finding indicates that “search influence” is a pathway through which judicial opinions can affect future legal development.

## Keywords

Topic model, Law search, Citation networks, Multi-networks, PageRank, Network curvature

## 1 Introduction

Judicial decision-making is characterized by the application by courts of authoritative rules to the stylized presentation of disputed claims between competing litigants. These authoritative rules are set forth in legal source materials such as constitutions, statutes, and written opinions supporting prior decisions. For a legal source to have bearing on a current dispute, it must be retrievable by the relevant legal actors. The problem of organizing legal texts into a comprehensible whole has been recognized since Justinian I’s Corpus Juris Civilis issued in 529–534. The acute problems of identifying relevant legal sources (i.e., legal precedent) presented by the common law tradition have spurred codification and classification efforts that have ranged from Blackstone’s “Commentaries on the Laws of England” (1765–1769) to the codification movement in the late nineteenth century (Garoupa and Morriss 2012), to the development and spread of the West American Digest System in the twentieth century (West 1909). Most recently, the effect of digitization on the evolution of the law, primarily in its impact on legal research, has become a subject of inquiry (see e.g., Berring 1986, 1987; Fronk 2010; Hanson and Allan 2002; Hellyer 2005; Katsh 1993; McGinnis and Wasick 2015; Schauer and Wise 2000).

In this paper we consider the textual corpus of legal sources as an evolving *landscape* that carries a natural geometry and comprises *regions* of the law whose development and shifting boundaries are influenced by the dynamics and feedback of *law search*. Everything devolves from a model of the process of legal research carried out in the corpus in which “actors” start from a case or opinion and then build out an understanding of the relevant issues by (1) following citations, (2) searching for cases that cite the initial case of interest, and (3) identifying textually similar cases. These actions have a natural network—more precisely, a *multinetwork*—formulation, in which legal sources are connected to each other based on citation information and textual similarity as described by a *topic model* representation of their textual content. Topic models represent texts (embodied as word-frequency distributions or “bag-of-words” representations) as mixtures of *topics*. “Topic” as used in this sense has a technical meaning and is defined as a probability distribution over the vocabulary in the corpus. Topics are uncovered and discovered according to a well-known and by now widely deployed methodology (see e.g., Blei 2012) that we briefly describe below. Our use of three kinds of connectivity (as opposed to one) in the text corpus structures the corpus in a multinetwork representation, a combinatorial structure that has proved useful in a number of different contexts, such as biology and economics (e.g., Barigozzi et al. 2011; Blinov et al. 2012; Kivelä et al. 2014). In this work we introduce for the first time the multinetwork concept to the novel contexts of text-mining and text search, with a specific application to judicial texts.

We use the multinetwork framework to define a notion of search generalizing the Markov model (discrete time random walk) that encodes Google’s famous “websurfer” webpage search model (Brin and Page 1998). The webpage ranking system Pagerank is simply the stationary vector of this model (Bryan and Leise 2006). Rankings are of course useful (and of course profitable), but the random walk also will give rise to a natural notion of distance on the underlying state space, roughly defined in terms of the expected time (number of steps) needed to go from one state to another and it is this metric point of view that we explore herein. In our setting, distance reflects the ease with which a human user of the legal corpus could navigate from one legal source to another, based on a weighted combination of searches along the underlying citation and topical similarity networks. The latter is usually reduced to a keyword search in standard resources (e.g., through a commercial database such as Lexis-Nexis). The derived inter-opinion distances support the discovery of well-defined regions (in this case, groups of legal sources) that are relatively close to each other, but relatively distant from other regions. Distance is also a proxy for relevance. When new judicial decisions are issued and the supporting opinions are incorporated into the legal corpus, they interact with search technology to change the legal sources that will be discovered during the next search. For example, some new opinions can link together previously distant opinions, making them more easily discoverable. In turn, these new connections can foster new arguments. This is a new kind of legal effect that, as far as we know, has never been identified as a theoretical possibility, much less formalized and subjected to an empirical test.

The random walk setting also enables the creation/definition of a notion of *curvature* for the underlying state space (think of a state space as the cities and towns in a landscape of rolling hills and valleys). As per the usual interpretation of this geometric notion, the more negative the curvature of a region^{1} of the legal landscape, the easier it is to navigate to legal sources outside that region from legal sources that are inside of the region. Curvature may change over time as new legal sources are added to the corpus. An increase in curvature in a given region^{2} indicates increasing difficulty in navigating from the interior of the region to legal sources outside it. This has the interpretation that the region has become more isolated from the rest of the legal corpus and thus is less relevant to new opinions outside of the region. We refer to this effect as *puddling*. The opposite effect wherein curvature decreases is referred to as *drainage*. Drainage is characterized by ease of navigation from points (legal sources) inside the region to those that are outside. Notions of network curvature have only just begun to make their way into applied literature. Some early work has adapted the idea of Ricci curvature to the network setting, mainly for its relation to various isoperimetric inequalities (see e.g., Chung and Yau 1996; Lin and Yau 2010). More recent work approaches the idea from the point of view of *optimal transport* (Ollivier 2009). This in turn makes strong connections to discrete Markov chains—as does ours—but this other work is quite different from the approach taken herein.

Use of the citation network to measure the influence of judicial opinions is now well-studied (see e.g., Bommarito et al. 2009; Fowler and Jeon 2008; Fowler et al. 2007), although interesting potential avenues of this kind of investigation in the judicial context remain underexplored (see e.g. Uzzi et al. 2013 for a citation network analysis in the context of scientific articles). Topic models, however, have only very recently entered legal studies and have already shown great promise as a foundation for new quantitative avenues of analysis (George et al. 2014; Livermore et al. 2017; Nardi and Moe 2014; Rice 2012).

Citation networks and topic modeling are examples of computational methods useful to legal studies. Early conversations concerning law and digitization focused on distinction in “context” between digital and physical forms, for example, whether digitization enhanced or reduced reading comprehension or facilitated or undermined serendipity in conducting searches. In particular, the legal significance of the effects of various search modalities (citation-based, keyword, unstructured text) is only just becoming apparent (see e.g. McGinnis and Wasick 2015). Our work may suggest ways to begin to quantify some of these effects, and empirical studies comparing our search model with actual human search results are in preparation. In this paper we focus on the collection of all U.S. Supreme Court cases from 1951 to 2002. A project to extend our work to include the Circuit courts is already underway.

In the next section we explain in a bit more detail the mathematical background and framework. Section 3 presents our results, showing that the precise notions of puddling and drainage correspond to a measurable waning and waxing respectively of relevance over time. We also briefly introduce the publicly accessible database and user interface (www.bendingthelaw.org) that we have constructed for the engagement with and visualization of the multinetwork of opinions. We then conclude with some thoughts about next steps and extensions of this work. Two technical appendices provide a more detailed mathematical justification (based on Riemannian geometry) for our definition of multinetwork curvature as well as motivation for a certain parameter choice in the analysis. The paper can be read without these sections, but we include them for the sake of completeness.

## 2 The mathematical framework

### 2.1 A random walk model for legal research

We frame the *legal search process* in this setting as a probabilistic process of “local” exploration of the opinion corpus, modeling the way in which a user of the legal corpus might navigate from opinion to opinion while researching an issue. This navigation is naturally viewed as a *Markov chain* (see e.g., Grinstead and Snell 1997), formulated as a matrix *T* of *transition probabilities* where the states are indexed by the opinions: given opinions *a* and *b*, the value of the entry *T*(*a*, *b*) is the probability of “moving to” opinion *b* “from” opinion *a* in an exploration of the legal corpus.^{3} More precisely, framing this as a “random walk” in “opinion space”, *T*(*a*, *b*) is the probability of moving at the next step to case *b*, given that you are currently at case *a*, i.e., the *conditional probability*

\[T(a,b) = P(\text{next opinion is } b \mid \text{current opinion is } a).\]

The transition probabilities are constructed as a combination of several terms, reflecting our stylized model of navigation of the space of legal opinions.^{4} We assume the possibility of three basic types of local exploration from an initial opinion *a*: (1) consideration of opinions cited by *a*; (2) consideration of opinions that cite to *a*, and (3) consideration of opinions that are *textually similar* to *a*. Our Markov chain (transition matrix) is thus represented as a linear combination of the individual chains, \(T_{\text{ cited-by }}, T_{\text{ cited }},\) and \(T_{\text{ sim }}\):

\[T(a,b) = p_{\text{cited}}(a)\,T_{\text{cited}}(a,b) + p_{\text{cited-by}}(a)\,T_{\text{cited-by}}(a,b) + p_{\text{sim}}(a)\,T_{\text{sim}}(a,b), \qquad (1)\]

with nonnegative weights satisfying \(p_{\text{cited}}(a) + p_{\text{cited-by}}(a) + p_{\text{sim}}(a) = 1\) for each opinion *a*. As per the notation, the weights may vary by initial state (*a*), though in what follows we will typically have them globally constant. In fact, for the sake of analysis we will assume these weights are uniform (each equal to \(\frac{1}{3}\)). Our implementation allows the weights to vary (cf. Sect. 3.2). In general, throughout this paper, we typically choose our parameters to be simple natural choices, reflecting the initiatory nature of this paper and the early stages of this project. Any particular parameter or group of parameters could be optimized with more data and an appropriate training paradigm. Ideally, the weights would be determined by training them with respect to an appropriate objective function, and the ideal objective function would be related to the effectiveness of the exploration. This would require feedback from users, and in Sect. 3.2 we discuss an implementation which could eventually allow for such a training paradigm to be implemented.
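As an illustration, the weighted combination of the three chains can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation behind the paper: the function name `combine_chains`, the list-of-lists matrix layout, and the toy matrices are assumptions made for the example, and it assumes globally constant weights and that every row of each component chain already sums to one (the text does not say how opinions with no citations are handled).

```python
def combine_chains(T_cited, T_cited_by, T_sim, weights=(1/3, 1/3, 1/3)):
    """Return the convex combination of three transition matrices.

    Matrices are square lists of lists. `weights` holds the (assumed
    globally constant) values p_cited, p_cited_by, p_sim.
    """
    p_cited, p_cited_by, p_sim = weights
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    n = len(T_cited)
    return [
        [p_cited * T_cited[a][b]
         + p_cited_by * T_cited_by[a][b]
         + p_sim * T_sim[a][b]
         for b in range(n)]
        for a in range(n)
    ]


# Toy corpus of three opinions (hypothetical values; each row sums to one).
T_cited = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
T_cited_by = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
T_sim = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]

T = combine_chains(T_cited, T_cited_by, T_sim)
```

Because the weights are nonnegative and sum to one, each row of the combined matrix is again a probability distribution whenever the component rows are.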

### 2.2 Construction of the components \(T_{\text{cited}} ,T_{\text{cited-by}} ,\hbox {and}\;T_{\text{sim}}\)

The transition matrices \(T_{\text{ cited }}\) and \(T_{\text{ cited-by }}\), based on the citation network are straightforward to construct. A natural and standard choice is to weight equally all opinions cited by a given opinion, and similarly for all opinions that cite the given opinion. Thus, if opinion *a* cites opinions \(b_1,\dots ,b_k\) then \(T_{\text{ cited }}(a,b_i) = {1\over k}\). Similarly, if *a* is cited by opinions \(b_1,\dots ,b_k\), then \(T_{\text{ cited-by }}(a,b_i) = {1\over k}.\) While we choose to work with equal weights, this weighting could be modified in some way, perhaps accounting for some notion of the importance of an opinion. To find the citation network we make use of the excellent “Supreme Court Citation Network Data” database created by Fowler and Jeon (cf. Supreme Court 2015).
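The equal-weight construction of the two citation chains can be sketched as follows. The function name `citation_chains`, the dictionary-of-lists input format, and the toy data are assumptions for illustration only. Opinions that cite nothing (or are never cited) are left with an all-zero row, a case the text does not address.

```python
def citation_chains(cites, n):
    """Build T_cited and T_cited_by for opinions indexed 0..n-1.

    `cites[a]` lists the opinions cited by opinion a (hypothetical format).
    Each cited or citing opinion receives equal weight 1/k.
    """
    cited = {a: sorted(set(cites.get(a, []))) for a in range(n)}
    cited_by = {a: [] for a in range(n)}
    for a in range(n):
        for b in cited[a]:
            cited_by[b].append(a)

    T_cited = [[0.0] * n for _ in range(n)]
    T_cited_by = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in cited[a]:
            T_cited[a][b] = 1.0 / len(cited[a])
        for b in cited_by[a]:
            T_cited_by[a][b] = 1.0 / len(cited_by[a])
    return T_cited, T_cited_by


# Toy example: opinion 2 cites opinions 0 and 1; opinion 1 cites opinion 0.
cites = {2: [0, 1], 1: [0]}
T_cited, T_cited_by = citation_chains(cites, 3)
```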

Navigation via textual similarity using something deeper than keywords is a novel contribution of this work and for this we make use of a *topic model*. A detailed description of topic modeling is beyond the scope of this paper, but a short description will suffice for the purposes of exposition. Very briefly, a topic—in the technical sense—is a probability distribution over a vocabulary. Topic modeling is the unsupervised derivation of a set of such distributions that represents a text corpus of *documents* (technically defined as a roughly contiguous set of words in the corpus, that is usually itself composed of larger portions of text—e.g., full opinions as opposed to the word blocks it comprises). Topics are defined according to a simple generative *bag-of-words model*^{5} for the documents in the corpus: given a document, first a topic is chosen at random and then a word is chosen at random within the topic. The topics are then the best fit solution to the actual bag-of-words representation of the documents. Recalling that bag-of-words is essentially a representation of each document as a word distribution, the topic model derives the “atomic” probability distributions that express each document in the corpus as a mixture of such atoms. The wide applicability of topic models in many disciplines has made for a broad community of topic modelers and the topic modeling technology has quickly become an “off-the-shelf” technology ready for deployment (see e.g., MALLET 2015) with a minimum of start-up cost. See Blei (2012) for one of the many friendly explanations of topic modeling.

The only supervision in the basic topic modeling algorithm is the choice of number of topics to be computed. We choose to use 100 topics, which for our corpus of 21,893 opinions (documents) is adequate. The most widely discussed method for choosing the number of topics involves treating the number of topics as a model parameter and inferring it from the data (Griffiths and Steyvers 2004). This method requires, however, more computational resources than are typically available as resources needed increase rapidly with the number of topics allowed. With such a large corpus of (long) documents, for example, fitting a corpus with 1000 topics is not possible in a reasonable amount of time. The approach we adopt—and we think it reflects the current best practice—is to choose a maximum number of topics based on time and computational resources available. Picking a larger number of topics than the data supports is not a risk because the widely used specifications of the topic model [used by MALLET (2015) and in the software we use Buntine and Mishra (2014)] will simply leave them empty. For example, if the data suggest that 50 topic distributions is sufficient to account for the data, fitting a model with a maximum of 100 topics will recover the same model as fitting the model with a maximum of 50 topics.

When the topic modeling is completed we therefore have a set of topics \(\text{ Topic}_{1}, \dots , \text{ Topic}_{100}\), where each word *w* in the vocabulary has a weight in each topic \(\text{ Topic}_{k}(w) \ge 0\) and any given opinion *a* is represented as a distribution over topics, \(\sum _k \alpha _k(a) {\text{ Topic}}_k \; \left(\sum _k\alpha _k(a) = 1; \;\; \alpha _k(a) \ge 0 \right)\). Table 1 shows the most highly weighted words in five of the topics. The indexing of the topics in the table is not relevant. The labels (in parentheses) are assigned by the user (in this case the authors of this paper). The full set of topics for our SCOTUS dataset is available online.^{6}

Table 1: Some representative topics derived from the SCOTUS corpus

| \(\text{Topic}_{1}\) (jury process) | \(\text{Topic}_{2}\) (housing) | \(\text{Topic}_{8}\) (evidence) | \(\text{Topic}_{58}\) (abortion) | \(\text{Topic}_{59}\) (search) |
|---|---|---|---|---|
| Jury | Housing | Court | Abortion | Search |
| Trial | Lease | Case | State | Warrant |
| Evidence | Property | Evidence | Woman | Fourth |
| Defendant | Rent | Record | Medical | Amendment |
| Error | Credit | Fact | Physician | Evidence |
| Verdict | Building | Question | Life | Arrest |
| Reasonable | Bond | Facts | Health | Police |
| Instruction | Tenant | Did | Roe | Cause |
| Doubt | Real | Issue | Consent | Probable |
| Instructions | Rental | Findings | Statute | Seizure |

To construct \(T_{\text{ sim }}\), for a given opinion *a* identify the \(N_{\mathcal{T}}\) most heavily weighted topics expressed in opinion *a* (using the \(\alpha _k(a)\) to define the weight) and for a given topic \({\text{ Topic}}_k\) identify the \(N_{\mathcal{O}}\) opinions in which \({\text{ Topic}}_k\) was most strongly expressed (using the \(\alpha _k\) here as well).^{7} Intuitively we view this as the process of a search returning the top \(N_{\mathcal{T}}\) topics related to the initial opinion *a* followed by a search of the top \(N_{\mathcal{O}}\) opinions associated to each of these top topics. To weight the final results of the search, for the given opinion *a* we create an \(N_{\mathcal{T}} \times N_{\mathcal{O}}\) matrix in which the (*i*, *j*) entry is the index of the *j*th most significant opinion in the corpus for the *i*th most significant topic in opinion *a*. If we define \(W_{a,b}\) to be the number of times opinion *b* occurs in this matrix, then \(T_{\text{ sim }}\) is the random walk produced by normalizing according to these weights. More precisely, for any *b* with \(W_{a,b} > 0\),

\[T_{\text{sim}}(a,b) = \frac{W_{a,b}}{\sum_{c} W_{a,c}}.\]

This completes the construction of *T*.
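The similarity chain described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `similarity_chain`, the matrix `alpha` (with `alpha[a][k]` the weight of topic *k* in opinion *a*), and the toy values are hypothetical, and ties are broken by index order, a detail the text does not specify.

```python
def similarity_chain(alpha, n_top_topics, n_top_opinions):
    """Build T_sim from opinion-by-topic weights alpha[a][k].

    For each opinion a, take its top topics, collect the top opinions for
    each of those topics, count occurrences W[a][b], and normalize.
    """
    n, n_topics = len(alpha), len(alpha[0])

    # For every topic, the opinions in which it is most strongly expressed.
    top_opinions = []
    for k in range(n_topics):
        ranked = sorted(range(n), key=lambda a: -alpha[a][k])
        top_opinions.append(ranked[:n_top_opinions])

    T_sim = []
    for a in range(n):
        top_topics = sorted(range(n_topics), key=lambda k: -alpha[a][k])
        W = [0] * n
        for k in top_topics[:n_top_topics]:
            for b in top_opinions[k]:
                W[b] += 1  # number of times b occurs in the search matrix
        total = sum(W)
        T_sim.append([w / total for w in W])
    return T_sim


# Toy corpus: four opinions, three topics (hypothetical weights).
alpha = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.7, 0.1],
]
T_sim = similarity_chain(alpha, n_top_topics=2, n_top_opinions=2)
```

Note that an opinion can appear in its own search results under this reading, since the description does not exclude it.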

### 2.3 The exploration geometry

The random walk *T* induces a notion of distance on the space of opinions, which we call *PageDist*.^{8} We call the induced geometry an *exploration geometry*.^{9}

To define *PageDist* we attach one last parameter *r* to the random walk of (1): at each step assume a probability \(r > 0\) of continuing the exploration. Then given *r* and starting at an opinion *a*, the expected number of visits to opinion *b* is

\[R(a,b) = \sum_{k=0}^{\infty} r^k\, T^k(a,b),\]

where \(T^k(a,b)\) is the probability of going from *a* to *b* in *k* steps. Intuitively, \(R(a,\cdot )\) forms an exploration neighborhood of opinion *a* in the sense that the higher the value of *R*(*a*, *b*) the more opinion *b* is considered to be in a neighborhood of *a*. Notice that *r* governs the size of this neighborhood as a sort of radius. If \(r=0\) then the neighborhood consists of only the opinion *a*, while if \(r=1\) (and the chain is irreducible) then the series diverges everywhere and the whole space is *a*’s exploration neighborhood. So we need a value between 0 and 1 and in what follows we choose \(r=\frac{1}{2}\) to keep it simple. As discussed above, with a fixed objective function and enough training data one could optimize this choice of *r* (perhaps even locally).
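A short sketch of the exploration neighborhood follows. It reads the expected number of visits as the discounted series \(R(a,b) = \sum_{k \ge 0} r^k T^k(a,b)\), which is our reading of the description above, and approximates it by truncating the series. The function name `exploration_neighborhood` and the truncation length are assumptions for the example.

```python
def exploration_neighborhood(T, r=0.5, n_terms=60):
    """Approximate R = sum_{k>=0} r^k T^k by a truncated series.

    T is a square list of lists. With r = 1/2 the neglected tail is of
    order 2**-n_terms, so 60 terms is ample for illustration.
    """
    n = len(T)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    R = [row[:] for row in term]  # k = 0 term: the identity matrix
    for _ in range(n_terms):
        # Next term of the series: r * (previous term) * T.
        term = [
            [r * sum(term[i][m] * T[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
        for i in range(n):
            for j in range(n):
                R[i][j] += term[i][j]
    return R


# Two opinions that always move to each other (toy example).
T = [[0.0, 1.0], [1.0, 0.0]]
R = exploration_neighborhood(T, r=0.5)
R_zero = exploration_neighborhood(T, r=0.0)
```

In this toy case the walk returns to its start on even steps only, so \(R(0,0) = 1/(1 - r^2) = 4/3\) and \(R(0,1) = r/(1 - r^2) = 2/3\). Setting \(r = 0\) leaves only the starting opinion in the neighborhood, matching the limiting case discussed in the text.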

These exploration neighborhoods define a metric on the opinion space, *PageDist*, given by

\[\text{PageDist}(a,b) = \left\| \frac{R(a,\cdot )}{\Vert R(a,\cdot )\Vert _p} - \frac{R(b,\cdot )}{\Vert R(b,\cdot )\Vert _p} \right\| _p,\]

where \(\Vert \cdot \Vert _p\) denotes the *p*-norm.^{10} Notice that if the neighborhood descriptions of *a* and *b* nearly agree then this will be near zero, and if they are very distant *R*(*a*, *x*) will be nearly zero when *R*(*b*, *x*) is large and vice versa, resulting in a large value of \(\text{ PageDist }(a,b)\) (in other words, a large distance between the opinions). So the *PageDist* metric will capture a notion of distance within the landscape. Figure 1 shows the distribution of distances among our corpus of Supreme Court opinions. In what follows, we choose the Euclidean norm (\(p = 2\)) to keep it simple. Again, with a fixed objective function and enough training data the choice of *p* could also be optimized.
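The distance computation can be sketched as below. This is a hedged illustration only: it takes PageDist to be the *p*-norm of the difference between the exploration neighborhoods \(R(a,\cdot)\) and \(R(b,\cdot)\), and whether each neighborhood is first scaled to unit *p*-norm is treated here as an assumption, exposed through the hypothetical `normalize` flag. The function names and the toy matrix are likewise made up for the example.

```python
def p_norm(v, p=2):
    """The p-norm of a vector given as a list of numbers."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def page_dist(R, a, b, p=2, normalize=True):
    """Distance between opinions a and b from their rows of R.

    `normalize=True` scales each row to unit p-norm before comparing;
    this scaling is an assumption of the sketch.
    """
    ra, rb = R[a], R[b]
    if normalize:
        na, nb = p_norm(ra, p), p_norm(rb, p)
        ra = [x / na for x in ra]
        rb = [x / nb for x in rb]
    return p_norm([x - y for x, y in zip(ra, rb)], p)


# Toy neighborhoods: opinions 0 and 1 overlap, opinion 2 is isolated.
R = [
    [4 / 3, 2 / 3, 0.0],
    [2 / 3, 4 / 3, 0.0],
    [0.0, 0.0, 2.0],
]
near = page_dist(R, 0, 1)
far = page_dist(R, 0, 2)
```

Opinions whose neighborhoods overlap (0 and 1) come out closer than opinions whose neighborhoods are disjoint (0 and 2), which is the qualitative behavior described in the text.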

The random walk setting also makes possible a definition of *curvature* that encodes a level of difficulty for escape from a given point in the execution of a random walk. If the degree of difficulty is large, a walk will have a tendency to get “stuck” in the neighborhood of the state. This can be interpreted as an opinion that doesn’t connect usefully with its surrounding or nearby opinions. Conversely, a more “fluid” area around an opinion suggests that it engages usefully with the broader opinion landscape. This kind of idea will be key to understanding the *relevance* of an opinion.

We refer to the change in *curvature* over time as *bending*. Let us make this precise. Given the node set *N* of a network with a transition matrix *T* reflecting a Markov process on the nodes, let \(S \subset N\) be some subset of nodes. A Markov chain on *N* induces a chain on the subset *S* by using weights \(W_S(a,b)\) defined in terms of the paths from *a* to *b* that go outside of *S*. We form a new transition matrix *P*(*a*, *b*; *S*, *N*) by normalizing \(W_S(a,b)\) so that the weights sum to one at each vertex. We call this the *induced local exploration*. This induces a corresponding exploration geometry and a curvature \(\kappa\) (defined as in (3,4)) for *S* relative to *N*, which we denote as \(\kappa (a; S,N)\). Bending will encode the change in curvature as *S* grows.

Regions where escape becomes more difficult over time are called *puddling regions* and regions where it becomes easier are called *drainage regions*. A precise definition works with the distribution of bending values: we call the subset corresponding to the bottom quartile of \(\text{ Bending }(*; t_1, t_0)\) the *Drainage* region (relative to the defining era), or *Drainage*\((t_1, t_0)\). Similarly, we call the subset corresponding to the top quartile of \(\text{ Bending }(*; t_1, t_0)\) the *Puddling* region (relative to the defining era), or *Puddling*\((t_1, t_0)\). Figure 2 shows the distribution of \(\kappa (*; 1990)\) as well as the bending of 1995 relative to 1990 in the Supreme Court opinion corpus (\(\text{ Bending }(*; 1995 > 1990)\)).
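The quartile-based split into drainage and puddling regions can be sketched as follows, taking the bending values as given. The function name `bending_regions`, the opinion labels, and the bending values are hypothetical. The quartile is taken here as the lowest and highest quarter of opinions by rank, and the handling of ties at the cut is a simplifying assumption.

```python
def bending_regions(bending):
    """Split opinions into Drainage (bottom quartile of bending) and
    Puddling (top quartile of bending).

    `bending` maps an opinion identifier to its bending value.
    """
    ranked = sorted(bending, key=bending.get)  # ascending bending
    q = len(ranked) // 4
    drainage = set(ranked[:q])
    puddling = set(ranked[len(ranked) - q:])
    return drainage, puddling


# Hypothetical bending values for eight opinions.
bending = {
    "A": -0.9, "B": -0.4, "C": -0.1, "D": 0.0,
    "E": 0.05, "F": 0.2, "G": 0.6, "H": 1.1,
}
drainage, puddling = bending_regions(bending)
```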

## 3 Results

The metrics we have developed enable us to determine the “relevance” of an opinion, as defined by its proximity to new opinions that are added to the corpus.

### 3.1 Metrics for relevance

Let \(N_t\) denote the set of opinions in the corpus at time *t*. Given \(t_2 \ge t_1 \ge t_0\), define the *set of relevant cases* (at some threshold *d*) as

\[\text{ Rel }_{t_2, t_1,t_0; d} = \{\, a \in N_{t_0} : \text{ PageDist }(a,b) < d \text{ for some } b \in N_{t_2} \setminus N_{t_1} \,\}.\]

That is, the opinions in \(\text{ Rel }_{t_2, t_1,t_0; d}\) are those opinions *a* published no later than \(t_0\) (i.e., those that could serve as precedent) that find themselves close to newly arrived (later) opinions, namely those issued in the period between \(t_1\) and \(t_2\).

The threshold *d* can be set based on various criteria. A natural way to set *d* is by taking into account the PageDist distribution. A guiding principle is to set *d* according to the percentage of cases that we want to declare as “relevant” over a given initial or baseline period. For fixed time periods \(t_0< t_1<t_2\), as the threshold *d* increases, so does the fraction of opinions in the corpus at time \(t_0\) that are considered relevant. Conversely, as the fraction of cases that will be viewed as relevant grows, this implicitly corresponds to an increased threshold *d*.

Define the *Initial Relevance Probability* (IRP) (for \(t_1 > t_0\) and a given threshold *d*) as the fraction of opinions present at time \(t_0\) that are in \(\text{ Rel }_{t_1, t_0,t_0; d}\), i.e., the fraction of opinions that remain relevant at time \(t_1\) according to a threshold *d*. Our goal is to understand how to predict which cases remain relevant as time goes on. Figure 3 shows how IRP varies with relevance to future cases \(P(\text{ Rel }_{t_2,t_1,t_0; d} \mid \text{ Rel }_{t_1,t_0,t_0; d})\).^{11} Therein we plot (using \(t_0=1990\), \(t_1=1995\), and \(t_2 = 2000\)) the *Momentum* against IRP (since *d* increases monotonically with IRP, we can view both axes as functions of *d*). Thus, “*Momentum*” measures the fraction of opinions that continue to be relevant. This behaves as might be expected, with an increasing percentage of opinions remaining relevant, until such a time as too many initial cases are tossed in, some of which will be opinions that have become vestigial.
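The relevance sets and the IRP can be illustrated with a small sketch. It reads the relevant set as those opinions present at \(t_0\) that lie within distance *d* of at least one opinion issued between \(t_1\) and \(t_2\). The strict inequality, the inclusive and exclusive year boundaries, the function name `relevant_set`, and all of the toy data are assumptions of the example rather than details taken from the text.

```python
def relevant_set(years, dist, t0, t1, t2, d):
    """Opinions issued by t0 that are within distance d of some opinion
    issued after t1 and no later than t2.

    `years` maps opinion -> year issued; `dist(a, b)` returns a distance.
    """
    old = [a for a, y in years.items() if y <= t0]
    new = [b for b, y in years.items() if t1 < y <= t2]
    return {a for a in old if any(dist(a, b) < d for b in new)}


# Hypothetical corpus: three early opinions and two later arrivals.
years = {"A": 1985, "B": 1988, "C": 1989, "D": 1993, "E": 1998}
toy_distances = {
    frozenset(("A", "D")): 0.2, frozenset(("B", "D")): 0.9,
    frozenset(("C", "D")): 0.4, frozenset(("A", "E")): 0.3,
    frozenset(("B", "E")): 0.8, frozenset(("C", "E")): 0.95,
}


def dist(a, b):
    return toy_distances[frozenset((a, b))]


t0, t1, t2, d = 1990, 1995, 2000, 0.5
initial = relevant_set(years, dist, t0, t0, t1, d)  # Rel_{t1,t0,t0;d}
future = relevant_set(years, dist, t0, t1, t2, d)   # Rel_{t2,t1,t0;d}

n_old = sum(1 for y in years.values() if y <= t0)
irp = len(initial) / n_old                       # Initial Relevance Probability
p_future_given_initial = len(future & initial) / len(initial)
```

In this toy corpus two of the three early opinions are initially relevant, and one of those two remains relevant to the later arrival.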

We can think of the initially relevant opinions as forming a region *R* which contains the recent legal action. If we imagine that we have constructed a random region with each of our independent samples, then \(P(\text{ Rel }_{t_2,t_1,t_0; d} \mid \text{ Rel }_{t_1,t_0,t_0; d}) \approx \text{IRP}\). So the *Momentum* measures how far beyond random our construction is, and we define the optimally “relevant” region as the one that’s furthest beyond random. Let us now fix \(d = d_{max}\) so as to correspond to \(\text{IRP}=0.2\) in Fig. 3. With the choice of *d* set, we now have fixed the parameter by which we identify opinions as relevant. A mathematical justification for this choice can be found in "Appendix B".

Having fixed *d* we can now examine the interaction between curvature and relevance, and in particular, the effect of being in either the drainage or puddling groups as respects the relevance of future cases. Let us start by defining our *Future Relevance Probability relative to a condition A*, written \(FRP(A)\), as the probability of future relevance given that condition *A* holds. This tells us whether knowing *A* helps to predict future relevance. Our goal is to see whether knowing something about the dynamic geometry, namely whether we are in a drainage or puddling region, helps us predict whether that region is more or less likely to be relevant in the near future. This entails the comparison of \(FRP(\text{ Drainage })\), \(FRP(\text{ Puddling })\), and \(FRP(\text{ All })\).

This comparison is shown in Fig. 4. We see the relevance of future cases (the blue line in the online version, the solid line in the paper copy) compared to the relevance of future cases in the drainage and puddling regions. Indeed, drainage regions (low bending) have a chance of being relevant for future cases that is roughly \(10\%\) or more higher than that of puddling regions (high bending). That is, the drainage regions that are connecting up the space are more associated with future relevance.

### 3.2 Implementation

The ideas presented in this paper form the foundation of a new web-based search tool for exploring a space of legal opinions using the exploration geometry introduced in the body of this paper. Specifically, we have built a prototype website and user interface (UI) that will enable the exploration according to PageDist of an opinion database that ultimately will encompass all Federal Court and Supreme Court cases. At present it is running on a small subset (SC cases 1950–2001). This prototype can be found at www.bendingthelaw.org.

Currently, our UI introduces users to cases in the “vicinity” (in the sense of our exploration geometry) of a pre-identified case specified by the user. The anticipation is that these cases will be strong candidates for precedent-based reasoning. As per (1) the search returns the “neighborhood” of the case that depends on the database of cases as well as the individual weights assigned to the three-component random walk process encoding the exploration geometry—that is, a choice of weights \(p_{\text{ cited }}, p_{\text{ cited-by }},\) and \(p_{\text{ sim }}\). As a first step we allow a choice of weights from \(\{0,1,2\}\) with at least one positive weight, so that \(W = w_{\text{ cited }} + w_{\text{ cited-by }} +w_{\text{ sim }}\), \(p_{\text{ cited }} = w_{\text{ cited }}/{W}\), \(p_{\text{ cited-by }} = w_{\text{ cited-by }}/{W}\), and \(p_{\text{ sim }} = w_{\text{ sim }}/{W}\).
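The mapping from the user's integer weight choices to the probabilities used in the random walk is a simple normalization, sketched below. The function name `normalize_weights` is hypothetical and the sketch is not the code running on the website.

```python
def normalize_weights(w_cited, w_cited_by, w_sim):
    """Turn integer choices from {0, 1, 2} into probabilities that sum to one."""
    choices = (w_cited, w_cited_by, w_sim)
    if any(w not in (0, 1, 2) for w in choices):
        raise ValueError("each weight must be 0, 1, or 2")
    total = sum(choices)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return tuple(w / total for w in choices)


# Emphasize outgoing citations twice as much as the other two modes.
p_cited, p_cited_by, p_sim = normalize_weights(2, 1, 1)
```

Choosing all three weights equal recovers the uniform weights of \(\frac{1}{3}\) used in the analysis.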

Recall that the similarity piece of the random walk, \(T_{\text{ sim }}\) requires that we construct the “topic by opinion” matrix of a given size. We choose that to be \(10 \times 10\)—i.e., that for any given topic we consider the 10 opinions that make the most use of it and conversely, for any opinion, we consider the 10 topics that make the strongest contribution to it.

## 4 Closing thoughts

In this paper we introduce a new multinetwork framework integrating citation and textual information for encoding relationships between a large set of Supreme Court opinions. The citation component derives from the underlying citation network of opinions. The textual piece derives from an LDA topic model computed from the text corpus. A metric on the opinion space is the reification of a basic model of legal search as would be executed by a prototypical legal researcher (“homo legalus”) looking for cases relevant to some initial case through textual similarity and citation. The model of search is articulated as a Markov chain on the network, built as a linear combination of the individual chains on the citation and topic networks. The Markov process produces a notion of *distance* between opinions which can also be thought of as a proxy for relevance. Along with distance, the Markov chain gives rise to a notion of curvature, and with this an implicit framing of the opinion corpus as a “landscape” which we call “the legal landscape”. We have implemented a first generation website that will allow users to explore a smallish subset of Supreme Court opinions using this search tool (www.bendingthelaw.org).

The text corpus evolves in the sense that cases enter the corpus regularly and in so doing continually transform the associated text landscape, changing interpoint distances and local curvatures. Of particular interest are those cases that remain relevant over long periods of time. Some regions of the legal landscape have the property that they serve as nexuses of connection for regions of the landscape. We show that those regions which over time become significantly more negatively curved are such connective areas. With the analogy of flow in mind, we call such areas regions of “drainage”. Areas which experience a significant increase in curvature we call “puddling regions”. We show that drainage areas are more likely to contain continually relevant cases than the puddling regions. We further show that opinions that start off relevant, in the sense of entering the landscape highly relevant to many cases over a short period of time, tend to remain relevant, thereby suggesting a property of (legal) *momentum*.

There are natural next steps to take with this idea. In one direction we will expand the text corpus to include all Supreme Court and Appellate Court opinions. We also plan to validate and compare our model by asking users to compare the results of our search algorithm (under a range of parameter choices) with their own usual research approaches. Our newly introduced opinion distance function gives a new variable with which to explore the relations of opinions to all kinds of social and economic variables. It is also natural to export this model to other court systems that produce English language opinions. In this regard it would be interesting to see the ways in which the “bending” of the court systems varies, and to try to understand what might account for such (possible) variation. Ultimately, it would also be of interest to effect the integration of distinct corpora via this model. In a related, but different direction, we will deploy this new navigation and search model on other corpora. To this end, the *Bending the Law* website includes navigable access to the United States Code (USC), Code of Federal Regulations (CFR), and Internal Revenue Code (IRC). In these corpora, sections and subsections are linked and referenced, and the topic modeling takes place on the level of sections. Future work will describe our findings in analyzing these newly multinetworked corpora, but for now, they exist as domains for new explorations for the public.

## Footnotes

- 1.
The standard example of a point of negative curvature is the *saddle point*, so named for the curvature of the center of a riding saddle. A marble placed there would rapidly move away from the point, if in an indeterminate direction.

- 2.
A well is a standard example of a point of positive curvature.

- 3.
*T* varies over time as new opinions are introduced, but very slowly in comparison with the legal search process. Our use of the chain is with respect to the search that is accomplished at some instant in time, so we can assume the process is time homogeneous and represented by a matrix.

- 4.
Other legal sources, including statutes and constitutions, have other types of internal ordering (such as organization by chapter or article) that may be relevant for law search. For purposes of this analysis, we restrict our application to the body of U.S. Supreme Court opinions and do not incorporate other sources of law. The framework of search that we develop, however, is generalizable to these other legal sources.

- 5.
“Bag-of-words” means that the document is summarized as the probability (frequency) distribution of the words comprising it.

- 6.
- 7.
The use of \(\alpha_k\) can be justified for \(N_{\mathcal{T}}\) by the interpretation \(P(\text{Topic}_k \mid a) = \alpha_k\). Assuming that cases are equally relevant a priori, we have for a fixed \(\text{Topic}_k\) that \(P(a \mid \text{Topic}_k) = \frac{P(a)}{P(\text{Topic}_k)} P(\text{Topic}_k \mid a) \propto P(\text{Topic}_k \mid a) = \alpha_k\), so we can use \(\alpha_k\) to order \(N_{\mathcal{O}}\) as well.

- 8.
We are indebted to Peter Doyle for early conversations regarding the geometrization of Markov chains and PageDist.

- 9.
It is worth noting that another natural candidate for a textual geometry is given in Leibon and Rockmore (2013), wherein the concept of a *network with directions* is introduced. Therein, “directions” function as “points at infinity”, producing a hyperbolic metric on the network. For this text corpus, and indeed for any text corpus, the pure topics provide an obvious choice of direction.

- 10.
Recall that this notation means \(\left( \sum_x |R(a,x) - R(b,x)|^p \right)^{1/p}\).

- 11.
Note that the conditional notation has the usual interpretation of \(P(A \mid B) = \#(A \cap B)/\# B\).
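The norm spelled out in footnote 10 can be computed directly. The sketch below is only an illustration: it assumes that `R` is given as a list of rows, with `R[a][x]` playing the role of \(R(a,x)\). The function name `row_distance` and the toy matrix are hypothetical.

```python
from typing import Sequence


def row_distance(R: Sequence[Sequence[float]], a: int, b: int, p: float = 2.0) -> float:
    """Return (sum_x |R(a,x) - R(b,x)|^p)^(1/p) for rows a and b of R."""
    return sum(abs(ra - rb) ** p for ra, rb in zip(R[a], R[b])) ** (1.0 / p)


# Toy matrix: two rows that differ in their first and last entries.
R_toy = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
]
d1 = row_distance(R_toy, 0, 1, p=1.0)  # |0.5| + |0.0| + |0.5|
d2 = row_distance(R_toy, 0, 1, p=2.0)  # sqrt(0.25 + 0.0 + 0.25)
```

For the toy rows, `d1` is 1.0 and `d2` is the square root of 0.5.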

## Notes

### Acknowledgements

The authors gratefully acknowledge the support of the Neukom Institute for Computational Science at Dartmouth College. Special thanks to Jason Linehan for building the beta version of the *Legal Landscapes* website. We also thank the referees for their careful reading of the manuscript.

## References

- Barigozzi M, Fagiolo G, Mangioni G (2011) Identifying the community structure of the international-trade multi-network. Phys A 390(11):2051–2066
- Berring RC (1986) Full-text databases and legal research: backing into the future. Berkeley Technol Law J 1:27
- Berring RC (1987) Legal research and legal concepts: where form molds substance. Cal Law Rev 75:15
- Blei DM, Lafferty JD (2006) Dynamic topic models. In: Proceedings of the 23rd international conference on machine learning, ICML ’06. ACM, New York, pp 113–120
- Blei DM (2012) Probabilistic topic models. Commun ACM 55(4):77–84
- Blei D, Lafferty J (2007) A correlated topic model of Science. Ann Appl Stat 1(1):17–35
- Blei D, Ng A, Jordan M (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
- Blinov ML, Udyavar A, Yarbrough W, Wang J, Estrada L, Quaranta V (2012) Multi-network modeling of cancer cell states. Biophys J 102(3):22a
- Bommarito MJ, Katz DM, Zelner J (2009) Law as a seamless web? Comparison of various network representations of the United States Supreme Court corpus (1791–2005). In: Proceedings of the 12th international conference on artificial intelligence and law (ICAIL 2009), pp 234–235
- Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30(1–7):107–117
- Bryan K, Leise T (2006) The $25,000,000,000 eigenvector: the linear algebra behind Google. SIAM Rev 48(3):569–581
- Buntine WL, Mishra S (2014) Experiments with non-parametric topic models. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 881–890
- Chung F, Yau ST (1996) Logarithmic Harnack inequalities. Math Res Lett 3:793–812
- Fowler JH, Jeon S (2008) The authority of Supreme Court precedent. Soc Netw 30:16–30
- Fowler JH, Johnson TR, Spriggs JF, Jeon S, Wahlbeck P (2007) Network analysis and the law: measuring the legal importance of Supreme Court precedents. Polit Anal 15(3):324–346
- Fronk CR (2010) The cost of judicial citation: an empirical investigation of citation practices in the federal appellate courts. Univ Ill J Law Technol Policy 2010(1):5825–5829
- Garoupa N, Morriss AP (2012) The fable of the codes: the efficiency of the common law, legal origins and codification movements. Univ Ill Law Rev 5:1443
- George CP, Puri S, Wang DZ, Wilson J, Hamilton W (2014) Smart electronic legal discovery via topic modeling. In: Proceedings of the 27th international FLAIRS conference, pp 327–332
- Griffiths TL, Steyvers M (2004) Finding scientific topics. Proc Natl Acad Sci 101(Suppl. 1):5228–5235
- Grinstead CM, Snell JL (1997) Introduction to probability. American Mathematical Society, Providence
- Hanson FA, Allan F (2002) From key numbers to keywords: how automation has transformed the law. Law Libr J 94:563
- Helgason S (2001) Differential geometry, Lie groups, and symmetric spaces (graduate studies in mathematics). American Mathematical Society, Providence
- Hellyer P (2005) Assessing the influence of computer-assisted legal research: a study of California Supreme Court opinions. Law Libr J 97:285
- Katsh E (1993) Law in a digital world: computer networks and cyberspace. Vill Law Rev 38:403
- Kivelä M, Arenas A, Barthelemy M, Gleeson JP, Moreno Y, Porter MA (2014) Multilayer networks. J Complex Netw 2(3):203–271
- Leibon G, Rockmore DN (2013) Orienteering in knowledge spaces: the hyperbolic geometry of wikipedia mathematics. PLoS ONE. https://doi.org/10.1371/journal.pone.0067508
- Lin Y, Yau ST (2010) Ricci curvature and eigenvalue estimate on locally finite graphs. Math Res Lett 17:345–358
- Livermore M, Riddell A, Rockmore D (2017) The Supreme Court and the judicial genre. Arizona Law Rev 59:837
- MALLET. http://mallet.cs.umass.edu/topics.php. Accessed Jan 2015
- McGinnis JO, Wasick S (2015) Law’s algorithm. Fla Law Rev 66:991
- Nardi DJ, Moe L (2014) Understanding the Myanmar Supreme Court’s docket. In: Crouch M, Lindsey T (eds) Law, Society and Transition in Myanmar. Hart Publishing
- Ollivier Y (2009) Ricci curvature of Markov chains on metric spaces. J Funct Anal 256:810–864
- Pinsky MA (1984) Brownian motion, exit times and stochastic Riemannian geometry. Math Comput Simul 26(4):357–360
- Polterovich I (2000) A commutator method for computation of heat invariants. Indag Math 11:139–149
- Rice D (2012) Measuring the issue content of Supreme Court opinions through probabilistic topic models. In: Presentation at the 2012 Midwest Political Science Association Conference. Illinois, Chicago
- Roberts M, Stewart B, Tingley D, Airoldi EM (2013) The structural topic model and applied social science. In: Advances in neural information processing systems workshop on topic models: computation, application, and evaluation
- Schauer F, Wise VJ (2000) Nonlegal information and the delegalization of law. J Legal Stud 29:495–515
- Supreme Court Citation Network Data. http://jhfowler.ucsd.edu/judicial.htm. Accessed Jan 2015
- Uzzi B, Mukherjee S, Stringer M, Jones B (2013) Atypical combinations and scientific impact. Science 342(6157):468–472
- West JB (1909) Multiplicity of reports 2. Law Libr J 4

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.