Introduction

In recent times, the study of social influence has extended beyond mathematical sociology [16, 24, 50] into computation [1, 4,5,6,7, 28, 30, 33, 34, 36, 37]. A computational study of “influence”—however we define it—is key to understanding the behavior of individuals embedded in networks. In this paper, we model and analyze social influence in a strategic setting where one’s behavior depends on others’ behavior. Since game theory captures such interdependence of behavior in a population, we ground our computational approach in game theory. The strategic setting of interest here is the U.S. Senate. We model the influence structure among the senators by taking into account the relevant context, which we call the spheres of legislation. We learn these models of influence from real-world behavioral data on Senate bills and voting records. Our particular focus is on analyzing machine learned influence networks to answer various questions on polarization and most influential nodes.

Interestingly, most computational models of influence assume a fixed network structure among individuals. We relax this simplifying assumption, allowing the network of influence to vary according to the spheres of legislation. For example, bills on finance may induce a very different influence network among senators than bills on defense, which may in turn have different impacts on inference problems like polarization and most influential nodes. One central question in this regard is: how do we identify different spheres of legislation that may have different implications on these inference problems? We address this in "Spheres of legislation" section.

After identifying spheres of legislation, we can learn an influence network among the senators for each sphere by adopting game-theoretic models of strategic behavior. Broadly speaking, the topic of modeling and analyzing congressional voting behavior has been getting a lot of attention in both political science and computer science [9, 19, 28, 30, 45], in part due to the availability of data.

In particular, we use the linear influence game (LIG) model of strategic behavior proposed by Irfan and Ortiz [29, 30]. We learn these models using data from the spheres of legislation. In an LIG, each senator exerts influence upon (and is subject to influences from) other senators in a network-structured way. The model focuses on interdependence among the senators and adopts the game-theoretic solution concept of Nash equilibrium to predict stable outcomes from a complex system of influences. This notion of Nash equilibrium leads to a definition of the most influential senators, where a group of senators is called most influential with respect to a desirable stable outcome represented by a pure-strategy Nash equilibrium (PSNE) if their support for that outcome influences enough other individuals to achieve that outcome. The LIG model will be elaborated in "The LIG model" section, and machine learning of this model using the spheres of legislation will be detailed in "Machine learning" section.

The main theme of this paper is how influence networks are affected by the underlying context, where the spheres of legislation represent the context. We should note here that contextual information has been considered before in the congressional setting. Recently, Irfan and Gordon [28] extended the LIG model to account for the bill context. They sought to combine both the social interactions and ideological leaning aspects of congressional voting. Although their model incorporates an aspect of bill context by assigning polarities to bills, it does not completely disentangle the influence network from the topics or subjects of the bills. The influence network produced by their model does not change due to the bill context, whereas we allow the influence network to change based on the spheres of legislation.

Additionally, Irfan and Gordon [28] focused on making predictions given a single bill, rather than analyzing the network as a whole. In "Towards richer models: ideal point models with social interactions" section, we briefly touch upon how their richer model can be applied to multiple spheres of legislation, thereby allowing the network to vary according to the context. The full exploration of their model within the spheres of legislation remains open.

As alluded above, while game-theoretic prediction of congressional votes has been well studied using the LIG model and its extensions [28,29,30], an analysis of the machine learned networks of influence did not get much attention, which we address here. Similarly, algorithms for computing most influential nodes in a strategic setting have been studied before (e.g., [30]), but their structural analogs like centrality measures have not been explored in a comparative fashion. In other words, what do we gain by using a game-theoretic definition of most influential nodes as opposed to a structural definition? We address questions like this.

Furthermore, polarization in social networks has been well studied [4, 16, 21, 22, 39, 41], especially in the political arena [10, 17, 27, 40, 52, 53] and often in a regional context [2, 43]. A detailed literature review is provided in Appendix A. Three salient points distinguish our approach from the rich body of literature: (1) Ours is a model-based approach, where networks are central to predicting collective outcomes, (2) we learn the networks using behavioral data because the networks are not observable, and (3) we seek to show that polarization in the Senate varies according to the spheres of legislation. We do not touch on the rising polarization in the Senate over time, which by now is a well-settled matter [15].

Two recent congressional terms—114th and 115th—are especially interesting for analyzing network behavior and polarization. The 114th Congress ran from January 2015 to January 2017, and the 115th Congress ran from January 2017 to January 2019. In both terms, Republicans controlled the Senate, but the executive power differed. In the 114th Congress, Barack Obama (D) held the presidency; in the 115th, Donald Trump (R) held the presidency. Despite different parties holding the presidency, both terms are perceived to be deeply polarized. Interestingly, when we study the different influence networks among the same group of senators arising from different spheres of legislation, we find that polarization does not apply equally across spheres; it very much depends on the sphere under consideration. Our aim is to put polarization and other inference questions, like most influential nodes, in context.

Spheres of legislation

We use an unsupervised machine learning technique, namely fuzzy clustering, to assign bills to different spheres of legislation based on the bill subjects. We learn the linear influence game (LIG) models, analyze influence networks, compute equilibria, and find most influential senators for each sphere separately. By doing so, we are able to examine differences and make comparative judgments across the spheres. We first describe how we prepare the data for clustering.

Preparing congressional roll-call data

Our model relies on data obtained from the @unitedstates project’s Congress repository (https://github.com/unitedstates/congress), a public domain program that allows easy access to official congressional data from the Congressional Research Service (CRS). In particular, we use bill data and roll-call data. Roll-call data contain senators’ “yea,” “nay,” or abstaining votes, while bill data include, among other attributes, a list of subjects associated with the bill. These 820 subjects range from “Abortion” to “Zimbabwe,” and each bill is described by multiple subjects. Additionally, each bill is assigned a single “top term,” the broad subject that best describes the bill out of 23 possible top-level subjects. We use the roll-call data to represent senator voting behavior and the bill data to extract bill topics.

Working with the combined data from multiple terms presents a thorny problem for graph-based analysis: senators come and go. Seats in the United States Senate often change during elections, when constituents have the chance to re-elect or replace incumbent senators. In the middle of a term, if a senator leaves their seat, a successor is appointed until the state can hold a special election to find a democratically elected replacement. In the 2016 election, at the onset of the 115th Congress, seven Senate seats changed hands; during the course of the 115th Congress, due to cabinet appointments by President Trump, scandals, and a death, the Senate saw seven more changes.

When a senator is not present for a vote, they neither influence nor are influenced by other senators’ votes during that roll call. Some senators in our dataset never overlap with one another; one left the Senate before the other joined. To reduce the number of such cases, we combined non-permanent senators under the following circumstances, given a departing senator A and an incoming senator B:

  1. Senator A does not run during an election, and senator B of the same party is elected to replace them.

  2. Senator A voluntarily or involuntarily steps down, and senator B of the same party is appointed as their replacement.

In these circumstances, we assume that the incoming senator behaves similarly to the departing senator. In other circumstances, such as when a senator loses their seat to a member of the opposing party, we keep both senators in the dataset. Changes in Senate membership, and the operations undertaken to reduce the total number of senators, are described in Table 1. Additionally, learning the LIG model requires data in the form of two discrete values: 1 (yea) or \(-1\) (nay). When a senator is not present for a vote—either because they were absent on that day or were not yet holding office—we fill in the missing data with the mean vote of their party.
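As a concrete illustration, the imputation step can be sketched as follows. The votes, party labels, and the choice to round the imputed party mean back to \(\pm 1\) are all hypothetical assumptions for this sketch, not the paper's exact procedure:

```python
import numpy as np

# Hypothetical roll-call matrix: rows = senators, columns = roll calls.
# 1 = yea, -1 = nay, np.nan = absent or not yet holding office.
votes = np.array([
    [ 1.0,  1.0, np.nan],   # senator 0, party "R"
    [ 1.0, -1.0,  1.0],     # senator 1, party "R"
    [-1.0, np.nan, -1.0],   # senator 2, party "D"
    [-1.0, -1.0, -1.0],     # senator 3, party "D"
])
party = np.array(["R", "R", "D", "D"])

def impute_party_mean(votes, party):
    """Fill each senator's missing votes with their party's mean vote on
    that roll call, then discretize back to {1, -1}."""
    filled = votes.copy()
    for p in np.unique(party):
        rows = party == p
        block = filled[rows]
        col_mean = np.nanmean(block, axis=0)      # party mean per roll call
        idx = np.where(np.isnan(block))
        block[idx] = col_mean[idx[1]]
        filled[rows] = block
    # Discretize; ties (a party mean of 0) default to yea -- an assumption.
    return np.where(filled >= 0, 1, -1)

X = impute_party_mean(votes, party)
```

After imputation, every entry of `X` lies in \(\{1, -1\}\), as the learning algorithm requires.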

Table 1 Changes in congressional membership within the 114th–115th congresses

Clustering algorithm

We seek to split the bills into a small number of broad categories, each encompassing many bills. Each bill has been tagged with a “top term” by @unitedstates, corresponding to congress.gov’s tag of “policy area.” According to congress.gov, “one Policy Area term, which best describes an entire measure, is assigned to every public bill or resolution.” The policy area vocabulary consists of 32 terms. However, these top terms/policy areas are too specific to be used as clusters on their own. In fact, making each top term its own cluster would result in some clusters containing only one bill and others containing a hundred. This would be problematic because the “outcome space” of LIGs is exponential in size, and as a result, learning LIGs requires a relatively large amount of data.

Rather than manually re-categorizing bills, we took a statistical clustering approach to grouping, based on a bill’s assigned “top term” in addition to all subjects it contains. For each data point, we assigned each possible subject a weight: 0 if missing, 1 if present, or 10 if it is the “top term.” By including both measures of subjects (top and regular), we produce more meaningful categories than using top terms or bill subjects lists alone.
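The weighting scheme can be sketched as a small feature-vector builder. The vocabulary and subject names below are hypothetical (the real vocabulary has 820 subjects):

```python
# Subject-weighting scheme: weight 10 for the bill's "top term", 1 for each
# other listed subject, and 0 for every subject missing from the bill.
vocabulary = ["Armed forces", "Taxation", "Health", "Trade"]

def bill_features(top_term, subjects, vocabulary):
    return [10 if s == top_term else (1 if s in subjects else 0)
            for s in vocabulary]

v = bill_features("Armed forces", {"Armed forces", "Trade"}, vocabulary)
# v: top term -> 10, other listed subject -> 1, absent subjects -> 0
```

Each bill thus becomes a fixed-length numeric vector suitable for distance-based clustering.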

In data science, K-Means (KM) is often used as a simple yet effective clustering algorithm [38]. In KM, n data points are partitioned into k clusters based on their Euclidean distance from cluster centers. In each iteration, every data point is assigned to a cluster based on the closest centroid; then, the centroid of each cluster is reset to the average position of the data points within that cluster. The process repeats until centroid positions converge. The choice of k is left up to the researcher; generally, k is chosen by trial and error. Cluster membership in KM is crisp, meaning that each data point belongs to one and only one cluster. While effective at producing distinct clusters, KM is not ideal for our purposes because bills often belong to multiple clusters. For example, a bill about increasing defense spending is about national security as well as economics.

The Fuzzy C-Means (FCM) clustering algorithm addresses this problem. FCM is an extension of KM which allows for overlaps in clusters [3, 47]. The objective function in FCM is largely the same as in KM, with the addition of membership values \(w_{ij}\) and a fuzzifier m. Membership values describe how closely each data point i belongs to cluster j. The fuzzifier controls the membership values: \(m=1\) results in crisp clusters (\(w_{ij} \in \{0, 1\}\)), and higher values of m result in fuzzier clusters. The FCM algorithm produces a list of cluster centers, describing the position of each centroid, as well as the fuzzy partition matrix, describing the membership degree of each bill to every cluster.
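A minimal FCM sketch using the standard membership and center updates may look as follows. This is an illustrative re-implementation under simplifying assumptions (random initialization, a fixed iteration count), not the exact implementation used in the paper:

```python
import numpy as np

def fuzzy_c_means(X, c, m=1.3, iters=100, seed=0):
    """Minimal Fuzzy C-Means: alternate the standard membership and
    center updates for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # distance of every point to every center (eps avoids division by 0)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        U = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1))).sum(axis=2)
    return centers, U

# Two well-separated groups; each point should belong mostly to one cluster.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [9.9, 10.0], [10.0, 9.8], [10.2, 10.1]])
centers, U = fuzzy_c_means(X, c=2)
```

A thresholding step like the one described below (membership above 0.15) would then assign each bill to one or more clusters.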

Iterating over a range of values, we found that \(c=4\) clusters and fuzzifier \(m=1.3\) resulted in clusters that were relatively distinct, had intuitive descriptions, and contained an adequate number of bills for machine learning. Additionally, we experimented with threshold values for cluster membership and settled on 0.15. That is, a bill is considered a member of a cluster if its membership value is above 0.15. Table 2 describes the results of our chosen FCM parameters. Each cluster is assigned a shorthand name describing its contents and is called a sphere of legislation in this paper. We next describe the model.

Table 2 Summary of four spheres of legislation: shorthand names and descriptions for each of the spheres of legislation identified by the FCM algorithm are shown here

The LIG model

We represent the Senate influence network as a linear influence game (LIG) [29, 30], a type of 2-action graphical game [35]. Nodes represent senators, or players, and are connected by directed edges. Edge weights represent the influence exerted by the source node upon the target. Influence weights can be negative, positive, or zero. The directed edges are allowed to be asymmetric, meaning that nodes A and B may exert different levels of influence on each other. Additionally, each node has a threshold level, which represents “stubbornness.” Nodes with thresholds further from zero are more resistant to change. Absent influences, a node with a negative threshold is predisposed to adopting action 1 (yea vote), and a node with a positive threshold is predisposed to \(-1\) (nay vote). The matrix of influence weights \({\mathbf {W}} \in {\mathbb {R}}^{n \times n}\) and the threshold vector \({\mathbf {b}} \in {\mathbb {R}}^n\) constitute the LIG model. The action \(x_i \in \{1, -1\}\) chosen by each node i is the outcome of the model, as described below in game-theoretic terms.

Each node’s best response to other nodes’ actions depends on the net incoming influence and the node’s threshold. When the total incoming influence from nodes playing 1 minus the total incoming influence from nodes playing \(-1\) exceeds the node’s threshold level, that node’s best response is 1. If below, it is \(-1\); in the case of a tie, the node is indifferent and can play either. Note that the best responses of the nodes are interdependent. A vector of mutual best responses of all the nodes is a stable outcome of the model, formally known as a pure strategy Nash equilibrium (PSNE). It is stable because no node has any incentive to deviate from it. The LIG model adopts PSNE to represent stable collective outcomes from a complex network of influence. Before formally defining the technical terms, we illustrate the model using an example.

Fig. 1

LIG example. A four-node LIG is shown here. The directed edges are labeled with influence levels. Any absence of a directed edge implies an influence level of 0. Threshold values of 0 (for simplicity) are shown with a connector to each node. We assume binary actions \(\{1, -1\}\). In this game, nodes A and B playing 1 and nodes C and D playing \(-1\) is a pure strategy Nash equilibrium (PSNE). To see this, consider node A first. We add up the incoming influences from those nodes (in this case, B) that are playing 1 and then subtract from it the influences coming from nodes (in this case, C and D) playing \(-1\). We get \(1 - (-2 -1.5) = 4.5\), which is basically the total weighted influence on A. Since 4.5 is greater than A’s threshold of 0, A’s best response is 1. Similarly, it can be shown that B, C, and D’s best responses are \(1, -1, -1\), respectively, and therefore, this is a PSNE. Similarly, nodes A and B playing \(-1\) and C and D playing 1 is another PSNE. As a negative example, all nodes playing 1 is not a PSNE. To see this, consider node A. The total weighted influence on A is \(1 + (-2) + (-1.5) = -2.5\), which is less than A’s threshold of 0. Therefore, A’s best response is to play \(-1\), which violates the mutual best response condition for PSNE

Example. Figure 1 illustrates the LIG model with a simple, four-node example. Note that the LIG model allows edges of opposite polarities between two nodes; this is not shown in the example for simplicity. As explained in Fig. 1, A and B playing 1 and C and D playing \(-1\) is a PSNE, whereas all nodes playing 1 is not a PSNE.
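The PSNE check in this example can be reproduced in code. Only node A's incoming weights are stated numerically in the caption of Fig. 1; the remaining rows of the weight matrix below are a hypothetical symmetric completion, assumed purely so the sketch is runnable:

```python
import numpy as np

# Nodes: 0=A, 1=B, 2=C, 3=D.  Row i holds node i's *incoming* weights.
# A's row comes from Fig. 1; the other rows are assumed for illustration.
W = np.array([
    [ 0.0,  1.0, -2.0, -1.5],   # A <- B, C, D (from the figure)
    [ 1.0,  0.0, -2.0, -1.5],   # assumed
    [-2.0, -2.0,  0.0,  1.0],   # assumed
    [-1.5, -1.5,  1.0,  0.0],   # assumed
])
b = np.zeros(4)                  # all thresholds are 0 in Fig. 1

def is_psne(W, b, x):
    """x is a PSNE iff every node's action is a best response, i.e.,
    x_i * (sum_j w_ij x_j - b_i) >= 0 (a tie allows either action)."""
    f = W @ x - b                # influence function of every node
    return bool(np.all(x * f >= 0))

ok1 = is_psne(W, b, np.array([1, 1, -1, -1]))   # True: the PSNE from Fig. 1
ok2 = is_psne(W, b, np.array([1, 1, 1, 1]))     # False: all-yea is unstable
```

For the all-yea profile, node A's total weighted influence is \(-2.5 < 0\), so A's best response is \(-1\), violating the mutual best-response condition, just as the caption explains.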

As shown for node A in the above example, the process of adding up incoming influences from nodes playing 1, then subtracting influences from nodes playing \(-1\), and finally comparing the result with the threshold value is succinctly captured by the influence function defined in Definition 3.1. The best response calculation (e.g., node A’s best response is to play 1 if the total weighted influence on A exceeds its threshold) can be done using the payoff function defined in Definition 3.2. Finally, PSNE is formally defined in Definition 3.3. In the following formal definitions, we use the same notation as [30].

Definition 3.1

(Influence function [30]) The influence function of each individual i, given others’ actions \({\mathbf {x}}_{-i}\), is defined as \(\textstyle f_{i}({\mathbf {x}}_{-i}) \equiv \sum _{j \ne i} w_{ij} x_j - b_i\) where for any other individual j, \(w_{ij} \in {\mathbb {R}}\) is a weight parameter quantifying the “influence factor” that j has on i, and \(b_{i} \in {\mathbb {R}}\) is a threshold parameter for i’s level of “tolerance.”

Here, individuals receive influences from other players and have an influence threshold of their own, which accounts for their own resistance to external influence. The influence function \(f_i\) calculates the weighted sum of incoming influences on i, as described in the paragraph above Definition 3.1, and subtracts i’s threshold from it.

Example. In the LIG shown in Fig. 1, when B plays 1 and C and D play \(-1\), the influence function of A is \(1 \times 1 + (-1) \times (-2) + (-1) \times (-1.5) - 0 = 4.5\). In contrast, when B, C, and D play 1, the influence function of A is \(1 \times 1 + 1 \times (-2) + 1 \times (-1.5) - 0 = -2.5\). Note that the influence function of A does not depend on A’s action.

We next define the payoff of each player. The payoff function happens to be one of the main ingredients of any game-theoretic model.

Definition 3.2

(Payoff function [30]) For an LIG, we define the payoff function \(u_i: \{-1,1\}^n \rightarrow {\mathbb {R}}\) as \(u_i(x_i,{\mathbf {x}}_{-i}) \equiv x_i f_i({\mathbf {x}}_{-i})\), where \({\mathbf {x}}_{-i}\) denotes the vector of a joint action of all players except i and \(f_i\) is defined in Definition 3.1.

The payoff function quantifies the preferences of the players based on the actions of other players. Given the actions of all other individuals \({\mathbf {x}}_{-i}\) and the influence function \(f_{i}({\mathbf {x}}_{-i})\), an individual will prefer to choose either 1 or \(-1\) as follows. When \(f_{i}({\mathbf {x}}_{-i})\) is negative, \(x_i = -1\) results in a positive payoff; when \(f_{i}({\mathbf {x}}_{-i})\) is positive, \(x_i = 1\) results in a positive payoff. An action chosen in this fashion to yield a positive payoff (i.e., to maximize the payoff) is called a best response.

Example. For the LIG shown in Fig. 1, when A and B play 1 and C and D play \(-1\), A’s payoff is \(1 \times 4.5 = 4.5\). In this scenario, A is playing its best response because if A were to play \(-1\), A’s payoff would have been \(-4.5\). As another example, when everyone plays 1, A’s payoff is \(1 \times (-2.5) = -2.5\). Here, A is not playing its best response because A could have gotten a payoff of 2.5 by switching to action \(-1\). Note that the payoff of a node does depend on the node’s own action.
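Node A's payoff computation uses only A's incoming weights, which are given in the example, so it can be verified directly:

```python
# Payoff of node A in Fig. 1, using only A's incoming weights and threshold.
w_A = {"B": 1.0, "C": -2.0, "D": -1.5}
b_A = 0.0

def payoff_A(x_A, x_others):
    """u_A = x_A * f_A(x_-A), where f_A = sum_j w_Aj x_j - b_A."""
    f_A = sum(w_A[j] * x_others[j] for j in w_A) - b_A
    return x_A * f_A

u1 = payoff_A(1, {"B": 1, "C": -1, "D": -1})    # 4.5: A plays a best response
u2 = payoff_A(1, {"B": 1, "C": 1, "D": 1})      # -2.5: A does not
```

Switching A's action flips the sign of the payoff, which is why the best response is simply the action matching the sign of the influence function.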

We next define the pure-strategy Nash equilibrium (PSNE) of an LIG. PSNE is one of the most central solution concepts in game theory. A PSNE signifies everyone playing their best responses simultaneously.

Definition 3.3

(Pure-strategy Nash equilibrium [30]) A pure-strategy Nash equilibrium (PSNE) of an LIG \({{\mathcal {G}}}\) is an action assignment \({\mathbf {x}}^* \in \{-1,1\}^n\) that satisfies the following condition. Every player i’s action \(x_i^*\) is a simultaneous best response to the actions \({\mathbf {x}}_{-i}^*\) of the rest.

Example. In our running example (Fig. 1), nodes A and B playing 1 and nodes C and D playing \(-1\) is a PSNE because it can be verified that every player is playing their best response simultaneously. As another example, nodes A and B playing \(-1\) and C and D playing 1 is also a PSNE. As shown in Fig. 1, all nodes playing 1 cannot be a PSNE.
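For small games, the PSNE set can be enumerated by brute force. The game below is a hypothetical completion of the Fig. 1 example (node A's incoming weights are from the caption; the other rows are assumed for illustration), and for it the enumeration recovers exactly the two equilibria discussed above:

```python
from itertools import product
import numpy as np

# Hypothetical completion of the Fig. 1 game (row i = node i's incoming weights).
W = np.array([[ 0.0,  1.0, -2.0, -1.5],
              [ 1.0,  0.0, -2.0, -1.5],
              [-2.0, -2.0,  0.0,  1.0],
              [-1.5, -1.5,  1.0,  0.0]])
b = np.zeros(4)

def enumerate_psne(W, b):
    """Brute-force check of all 2^n joint actions -- feasible only for tiny n."""
    n = len(b)
    psne = []
    for x in product([1, -1], repeat=n):
        x = np.array(x)
        if np.all(x * (W @ x - b) >= 0):     # every node best-responds
            psne.append(tuple(x))
    return psne

psne = enumerate_psne(W, b)
```

For real LIGs this exhaustive check is hopeless, since the number of joint actions grows as \(2^n\); this is precisely why the true proportion of equilibria is hard to handle, as discussed in the "Machine learning" section.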

We adopt PSNE as the notion of stable outcomes arising from a network of influence. We are interested in questions like how the network changes based on the spheres of legislation and what impact the spheres have on polarization and most influential nodes. For these, we learn the networks using the spheres data.

Machine learning

We use Honorio and Ortiz’s machine learning algorithm to instantiate an LIG from raw roll-call data [26]. The goal of the algorithm is to capture as much of the ground-truth data as possible as PSNE (the empirical proportion of equilibria), without having so many total PSNE (the true proportion of equilibria) that the model is meaningless. For example, if all influence weights and threshold levels are 0 (i.e., W = 0, b = 0), then all \(2^n\) possible joint actions among n players would be PSNE, trivially covering all observed voting data. However, this is undesirable as it has no predictive power at all. Therefore, we would like to maximize the empirical proportion of equilibria while minimizing the true proportion. What follows is a gist of Honorio and Ortiz’s machine learning algorithm, which results from a very lengthy proof [26].

Learning algorithm

To balance the true and empirical proportions of equilibria, the learning algorithm uses a generative mixture model that picks a joint action which is either a PSNE or non-PSNE of an LIG model \({\mathcal {G}}\) with probabilities q and \(1-q\), respectively. Of course, our goal is to learn the game \({\mathcal {G}}\). Let \({{\mathcal {N}}}{{\mathcal {E}}}({\mathcal {G}})\) denote the set of PSNE of \({\mathcal {G}}\) and \({\mathcal {D}} = \{{\mathbf {x}}^{(1)}, {\mathbf {x}}^{(2)},..., {\mathbf {x}}^{(m)}\}\) be the dataset of m voting instances. The empirical proportion of equilibria, \({\widehat{\pi }}({\mathcal {G}})\), is the fraction of data captured as PSNE of \({\mathcal {G}}\). This is formally defined as follows, where \(\mathbb {1}\) is the indicator function returning 1 if the condition is true, 0 otherwise:

$$\begin{aligned} {\widehat{\pi }}({\mathcal {G}}) \equiv \frac{1}{m} \sum _{{\mathbf {x}} \in {\mathcal {D}}} \mathbb {1}[{\mathbf {x}} \in {{\mathcal {N}}}{{\mathcal {E}}}({\mathcal {G}})]. \end{aligned}$$

The true proportion of equilibria, denoted by \(\pi ({\mathcal {G}})\), is the fraction of all joint actions among n players that are PSNE, regardless of their existence in the voting instance data. This can be expressed as:

$$\begin{aligned} \pi ({\mathcal {G}}) \equiv |{{\mathcal {N}}}{{\mathcal {E}}}({\mathcal {G}})|/2^n. \end{aligned}$$
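Both proportions are straightforward to compute for a tiny game by brute force. The game and dataset below are hypothetical (the weight matrix is an assumed completion of the Fig. 1 example, and the "voting" data are made up for illustration):

```python
from itertools import product
import numpy as np

# Hypothetical completion of the Fig. 1 game.
W = np.array([[ 0.0,  1.0, -2.0, -1.5],
              [ 1.0,  0.0, -2.0, -1.5],
              [-2.0, -2.0,  0.0,  1.0],
              [-1.5, -1.5,  1.0,  0.0]])
b = np.zeros(4)
n = len(b)

def is_psne(x):
    x = np.asarray(x)
    return bool(np.all(x * (W @ x - b) >= 0))

# A toy dataset of m = 4 observed joint actions.
D = [(1, 1, -1, -1), (1, 1, -1, -1), (-1, -1, 1, 1), (1, 1, 1, 1)]

# Empirical proportion: fraction of observed instances that are PSNE.
pi_hat = sum(is_psne(x) for x in D) / len(D)

# True proportion: fraction of all 2^n joint actions that are PSNE.
num_psne = sum(is_psne(x) for x in product([1, -1], repeat=n))
pi_true = num_psne / 2 ** n
```

In this toy instance three of the four observations are equilibria of the game, while only a small fraction of all \(2^n\) joint actions are, which is exactly the regime the learning algorithm favors.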

Given a set of voting instances \({\mathcal {D}}\), the average log-likelihood of the probabilistic generative model can be written as follows. Here, KL stands for the Kullback–Leibler divergence [11, Ch 2]:

$$\begin{aligned} \widehat{{\mathcal {L}}}({\mathcal {G}}, q) = KL ({\widehat{\pi }}({\mathcal {G}}) \, || \, \pi ({\mathcal {G}}) ) - KL ({\widehat{\pi }}({\mathcal {G}}) \, || \, q ) - n \log 2. \end{aligned}$$
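Here \({\widehat{\pi }}({\mathcal {G}})\), \(\pi ({\mathcal {G}})\), and q are all scalars in [0, 1], so each KL term can be read as a divergence between Bernoulli distributions with those success probabilities. For completeness, that divergence is

```latex
% KL divergence between Bernoulli(p) and Bernoulli(q),
% with the convention 0 log 0 = 0:
\mathrm{KL}(p \,\|\, q) = p \log\frac{p}{q} + (1 - p)\log\frac{1 - p}{1 - q}.
```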

Leaving the rigorous mathematical proof [26] aside, we can intuitively see how maximizing the above log-likelihood achieves maximization of the empirical proportion of equilibria \({\widehat{\pi }}({\mathcal {G}})\) relative to the true proportion of equilibria \({\pi }({\mathcal {G}})\). For this, note that the first term above, \(KL ({\widehat{\pi }}({\mathcal {G}}) \, || \, \pi ({\mathcal {G}}) )\), is maximized by a game \({\mathcal {G}}\) that makes \({\widehat{\pi }}({\mathcal {G}})\) as big as possible while making \({\pi }({\mathcal {G}})\) as small as possible. In other words, the game should capture as much of the data as possible as PSNE while keeping its total number of PSNE as small as possible.

Furthermore, the second term, \(- KL ({\widehat{\pi }}({\mathcal {G}}) \, || \, q )\) becomes 0 when \({\widehat{\pi }}({\mathcal {G}}) = q\). This indicates that the optimal mixture parameter q is \({\widehat{\pi }}({\mathcal {G}})\). This leaves learning \({\mathcal {G}}\) to maximize \(KL ({\widehat{\pi }}({\mathcal {G}}) \, || \, \pi ({\mathcal {G}}) )\) as the main task because we are maximizing the log-likelihood over all choices of \({\mathcal {G}}\) and q. The main challenge here is dealing with \(\pi ({\mathcal {G}})\) due to the hardness of computing PSNE [30]. However, it can be shown that with high probability, maximizing a lower bound of the log-likelihood is equivalent to maximizing \({\widehat{\pi }}({\mathcal {G}})\) over all choices of \({\mathcal {G}}\). This is equivalent to minimizing \(1 - {\widehat{\pi }}({\mathcal {G}})\), which leads to the following loss minimization formulation:

$$\begin{aligned} \min _{\mathbf{W },\mathbf{b }}\frac{1}{m}\sum _l \max _i\,{\ell }\big [{x_i^{(l)}( \mathbf{w }^T_{i,-i} \mathbf{x }^{(l)}_{-i} - b_i )}\big ]. \end{aligned}$$

Above, the loss function \(\ell\) represents the errors in best responses. It is easiest to explain using the 0/1 loss function \(\ell (z) \equiv \mathbb {1}[z < 0]\). Whenever any player in the l-th voting instance does not play its best response, \(\max _i\,{\ell }\big [{x_i^{(l)}( \mathbf{w }^T_{i,-i} \mathbf{x }^{(l)}_{-i} - b_i )}\big ]\) is 1. When all players play their best responses, \(\max _i\,{\ell }\big [{x_i^{(l)}( \mathbf{w }^T_{i,-i} \mathbf{x }^{(l)}_{-i} - b_i )}\big ] = 0\), signifying a PSNE. For the practical purposes of optimization, a continuous loss function like the logistic loss is used instead of the 0/1 loss.

The final optimization problem is the following:

$$\begin{aligned} \min _{\mathbf{W },\mathbf{b }}\frac{1}{m}\sum _l \max _i\,{\ell }\big [{x_i^{(l)}( \mathbf{w }^T_{i,-i} \mathbf{x }^{(l)}_{-i} - b_i )}\big ] + \rho ||\mathbf{w }||_1. \end{aligned}$$

Here, m is the number of bills, \(\ell\) is the logistic loss function, and \(\rho\) is an \(l_1\)-regularization parameter penalizing \(||\mathbf{w }||_1\) and thereby controlling the number of edges. That is, we prefer sparser networks as long as the solution quality is not degraded too much.
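The objective can be sketched on a toy instance. Everything below is hypothetical (the weight matrix is an assumed completion of the Fig. 1 example, and the dataset is made up); the sketch only illustrates that the per-instance term is the worst best-response violation across players, plus the \(l_1\) penalty:

```python
import numpy as np

# Hypothetical completion of the Fig. 1 game, used as a stand-in for (W, b).
W = np.array([[ 0.0,  1.0, -2.0, -1.5],
              [ 1.0,  0.0, -2.0, -1.5],
              [-2.0, -2.0,  0.0,  1.0],
              [-1.5, -1.5,  1.0,  0.0]])
b = np.zeros(4)
D = np.array([[1, 1, -1, -1], [-1, -1, 1, 1], [1, 1, 1, 1]])

def objective(W, b, D, loss, rho=0.0):
    """(1/m) * sum_l max_i loss(x_i (w_i^T x_{-i} - b_i)) + rho * ||W||_1."""
    total = 0.0
    for x in D:
        z = x * (W @ x - b)            # z_i = x_i (w_i^T x_{-i} - b_i)
        total += max(loss(zi) for zi in z)
    return total / len(D) + rho * np.abs(W).sum()

zero_one = lambda z: float(z < 0)
logistic = lambda z: np.log1p(np.exp(-z))   # smooth surrogate used in practice

J01 = objective(W, b, D, zero_one)   # only the all-yea instance errs
```

With the 0/1 loss, the first two instances (both PSNE of this game) contribute 0 and the third contributes 1, so the unregularized objective is 1/3; the logistic surrogate makes the same quantity differentiable for optimization.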

We solve the above optimization for each sphere of legislation and obtain an influence network. While doing this, we rigorously cross validate to avoid overfitting or underfitting as described in the next section.

Cross-validation and model selection for LIG

To make use of the \(l_1\)-regularized model, we must choose a regularization parameter \(\rho\). High values of \(\rho\) assign a higher penalty to the number of edges in the graph and result in a sparser graph, while low values of \(\rho\) assign a lower penalty and result in a denser graph. While low values of \(\rho\) fit the training data better, they carry a risk of overfitting—“memorizing” the data—which results in poor predictive performance on new data.

Additionally, the number of edges must be taken into consideration because the problem of computing equilibria is NP-hard [29, 30]. In fact, an extremely complex model would likely have so many edges that equilibria computation would not finish within a reasonable time frame of several days. However, an exceedingly low number of edges would lead to an under-fit model that does not generalize to new data. Therefore, we must pick a \(\rho\) value that strikes a balance between computation time and the risks of over- and under-fitting.

We use cross-validation (CV) to determine the effectiveness of a given \(\rho\) value. In CV, a process essential to most machine learning applications, data are partitioned into two sets: training and validation. The model is trained using the training set and then employed to make predictions against the validation set. The performance of the model is measured by the errors on the training and validation sets. When a model is overfit, validation error will be significantly higher than training error. When a model is under-fit, both validation and training errors will be high. In CV, researchers adjust the parameters of the machine learning algorithm to create the best model, one which neither underfits nor overfits the data.

With large datasets, training and validation sets are often created by splitting the data in half, or holding out some smaller proportion of the data. However, the four datasets generated by the clustering method are too small to form informative predictions if they are further reduced by this straightforward partitioning. Instead, we used k-fold CV, which leverages re-sampling to form useful insights on small datasets. In k-fold CV, the dataset is randomized and split into k partitions. In one run of k-fold CV, one of the k sets is chosen as the validation set, while the remaining \(k-1\) sets are combined to form the training set. On the next run, a different set is chosen as the validation set, and the others are used to train the model. Measures of accuracy and error from each run are averaged across the k runs. The choice of k is somewhat arbitrary, but \(k=10\) is often used in research applications, colloquially known as 10-fold CV.
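A minimal k-fold split can be sketched as follows. The fold logic mirrors standard practice; `n=47` below is an arbitrary illustrative dataset size, not a figure from the paper:

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Minimal k-fold split: shuffle indices, cut them into k nearly equal
    folds, and yield (train, validation) index pairs -- one fold held out
    per run."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

splits = list(kfold_indices(n=47, k=10))
```

Each data point appears in exactly one validation fold across the k runs, so the averaged validation metrics use every observation once.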

We ran 10-fold CV on each sphere with \(0< \rho < 0.01\), tracking three measures of model performance:

  1. Number of edges in the training set graph

  2. Best response (BR) error, or the percentage of senators not playing their best response, in training and validation sets

  3. q, the proportion of votes recorded as PSNE, in training and validation sets.

We chose \(\rho\) values, shown in Table 3, based on the following goals:

  1. The graph is sparse enough to efficiently compute equilibria

  2. The model neither overfits nor under-fits the data (i.e., BR error is low, and the differences between training and validation sets for BR error and q are low)

  3. The proportion of observed roll-call votes that are PSNE (q) according to the learned model is high

We next present the cross-validation results for Sphere 1.

Cross-validation on Sphere 1 (Security & Armed Forces). As shown in Fig. 2, the number of edges drastically decreases until \(\rho = 0.000367\) and then decreases at a slower rate, reaching a reasonable number of edges between 0.002424 and 0.003455. BR error in both the training and validation sets remains low until \(\rho \; \ge \;0.004\) and then begins to increase, showing that the model performs well until that point. Until \(\rho = 0.001512\), the drastic difference between training and validation q values shows that the model is overfit, and the drop to \(q=0\) when \(\rho >0.007014\) shows that the model is under-fit. Between values of 0.002154 and 0.003455, all metrics are within an acceptable range.

Fig. 2
figure 2

Cross validation: Sphere 1 (Security & Armed Forces). We perform tenfold CV for \(0< \rho < 0.01\). The plots for the number of edges, best response errors, and the proportion q of data that are PSNE are shown here

While we leave the detailed cross-validation results for the other spheres to Appendix B, the results share many similarities. Across all spheres, when \(\rho =0\), the learned model essentially memorizes the training data: training error is 0, validation error is relatively high, and the proportion q of data captured as PSNE is drastically higher for the training set than the validation set. This is the overfitting regime. As \(\rho\) increases, validation and training errors begin to converge, as do the validation and training q values. At higher \(\rho\) values, validation and training errors are both prohibitively high, and the learning enters the under-fitting regime.

Table 3 summarizes the \(\rho\) values that we have chosen according to the three criteria listed above. We use these values of \(\rho\) to produce the LIG models used throughout the rest of the paper.

Table 3 Chosen values of \(\rho\) for each sphere and the corresponding number of edges and the average best response (BR) error of validation sets
Fig. 3
figure 3

A bird’s eye view of the LIG network for Sphere 1 (Security & Armed Forces). Red nodes are Republicans, blue Democrats, and green Independents. Darker nodes have higher thresholds, and thicker edges carry larger influence weights. The strongest 40% of incoming and outgoing edges for each node are shown

Fig. 4
figure 4

A bird’s eye view of the LIG network for Sphere 2 (Economics & Finance). The interpretation is the same as Fig. 3

Polarization in context

Visualization of the machine learned networks clearly shows that the network structure varies according to the spheres of legislation. In all spheres, however, the force-directed drawing algorithm automatically distinguishes Republicans from Democrats. Figures 3 and 4 depict the LIG visualizations for Spheres 1 (Security & Armed Forces) and 2 (Economics & Finance) as representative examples. The visualizations for the remaining spheres can be found in Appendix F.3. In this section, we discover different degrees of polarization across the spheres by investigating cross-party (or cross-border) edges, influence weights and thresholds, and modularity measures. We begin with cross-party edges.

Cross-party edges

The boundary between the two parties is interesting for studying polarization. Even though negative edges more often occur at the boundary, the connectivity between the two parties varies a lot according to the spheres of legislation. These are depicted in Figs. 5, 6 for Spheres 1 and 2, respectively (others are in Appendix F.4).

Fig. 5
figure 5

Cross-party edges in Sphere 1 (Security & Armed Forces): Graphviz visualization of cross-party edges connecting members of the opposing parties within the strongest 40% of all edges. Here, 52 boundary edges are positive and 37 negative

Fig. 6
figure 6

Cross-party edges in Sphere 2 (Economics & Finance): Graphviz visualization of cross-party edges connecting members of the opposing parties within the strongest 40% of all edges. Here, 4 cross-party edges are positive and 8 negative

Figure 6 shows the cross-party edges in Sphere 2 (Economics & Finance), which starkly contrasts those of Sphere 1 (Security & Armed Forces) shown in Fig. 5. In Sphere 2, only 12 of the strongest 40% of edges are between members of different parties. Of these, 2/3 are negative, suggesting a very polarized network. Aside from two positive influences between Maine senators King (a left-leaning Independent) and Collins (a center-leaning Republican), the remaining two positive connections are the weakest of all connections shown for this sphere.

Similarly, examining inter-party edges reveals that Sphere 3 (Energy & Infrastructure) is also very polarized. While there are many edges between both parties in this network, about 70% of them are negative. Positive influences come from a few sources, again including the centrist Senator Collins. Incongruously, prominent right-wing senator Tom Cotton (R-AR) also exhibits positive influences with Democratic senators. However, most other far-left or far-right leaning senators, including Sanders (I-VT) and Cruz (R-TX), only exhibit negative influences with the opposite party.

Sphere 4 (Public Welfare)’s inter-party edges strike a balance between the polarities exhibited by the previous three spheres. There are slightly more positive edges (9) than negative edges (7), but still a low number of edges overall. Again, there are positive influences between Maine senators King (I-ME) and Collins (R-ME), but also positive influences between Senator McConnell and senators King (I-ME) and Tester (D-MT).

Overall, each sphere exhibits some level of polarization, but some spheres are far more polarizing than others. Some senators are present in every sphere’s inter-party boundary, whether for positive or negative influences. Maine Senators Collins (R) and King (I) often share positive influences with each other, as well as with other senators. Senator Lee (R-UT), a conservative libertarian, exhibits negative edges with members of the other party in every sphere, although in Sphere 1 he also shares positive influences with senators Harris (D-CA) and Feinstein (D-CA). Meanwhile, left-wing icon Bernie Sanders (I-VT) exhibits equivalent behavior, with only negative cross-party edges in all spheres except Sphere 1. These results suggest that Sphere 1 (Security & Armed Forces) is least polarized, whereas Sphere 2 (Economics & Finance) is highly polarized.

Influence weights and thresholds

We now take a closer look at the influence weights and thresholds of the machine learned models, beyond just the cross-party edges. Figure 7 shows a histogram of four different categories of edge weights: in the top row, Democrat-to-Democrat and Republican-to-Republican, and in the bottom row, Democrat-to-Republican and Republican-to-Democrat (note that the edges are directed). In each plot, the histograms for the four spheres are superimposed for the purpose of comparison. For the intra-party edges (D–D and R–R), Spheres 2, 3, and 4 have very similar histograms, and these differ from the histogram of Sphere 1 (Security & Armed Forces). At the peak, the number of intra-party edges in Sphere 1 is dominated by the other spheres. However, for higher edge weights, Sphere 1 dominates the other spheres. This indicates that there are stronger D–D and R–R influences in Sphere 1 compared to the other spheres, which in turn may indicate more polarization in Sphere 1. Interestingly, if we look at the cross-party edges (D–R and R–D), we can see that Sphere 1 again dominates the other spheres in the positive influence weights regime. Note that in the bottom row of Fig. 7, the peak of Sphere 3 dominates that of Sphere 1, but Sphere 3’s peak is in the negative influence regime, whereas Sphere 1’s peak is in the positive influence regime.Footnote 4 All of these indicate that there are more positive influences within and across the two parties in Sphere 1 compared to the other spheres, which contributes to Sphere 1 being less polarized.

Fig. 7
figure 7

Histograms of edge weights. Best viewed in color, each plot shows frequency polygons, a form of histogram, of edge weights for the four spheres. Clockwise from top left: Democrat-to-Democrat, Republican-to-Republican, Republican-to-Democrat, and Democrat-to-Republican edges. The data are parsed into bins \((x_1, x_2], (x_2, x_3], \ldots , (x_{n-1}, x_n]\). We then plot the number of data samples that appear in the bin \((x_{i-1}, x_i]\) against \(x_i\). The plots show how the influence weights in Sphere 1 (Security & Armed Forces)—shown in black dots—are different from the rest of the spheres

Of course, the influence weights cannot be read alone without considering thresholds because the game-theoretic model accounts for both of these in predicting stable outcomes. Recall that the threshold magnitude signifies stubbornness or resistance to influence. More positive threshold values resist positively weighted influences by leaning to play \(-1\) in the presence of (1) positive influence from those playing 1 and (2) negative influence from those playing \(-1\) (in both cases, a neighbor’s action times the influence from that neighbor is positive). More negative threshold values resist negatively weighted influences in a similar fashion.
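To make the interplay of weights and thresholds concrete, the best-response check implied by the influence function \(f_i({\mathbf {x}}_{-i}) = \sum _{j \ne i} w_{ij}x_j - b_i\) can be sketched as follows. The tie-breaking rule (playing 1 when \(f_i = 0\)) and the toy numbers are our assumptions, not the paper's:

```python
import numpy as np

def best_response(i, x, W, b):
    """Player i's best response under the LIG influence function
    f_i(x_{-i}) = sum_{j != i} w_ij * x_j - b_i.
    Tie broken toward +1 here (an assumption, not the paper's rule)."""
    f = W[i] @ x - W[i, i] * x[i] - b[i]   # exclude self-influence
    return 1 if f >= 0 else -1

# Three players: positive weights pull a player toward its neighbors'
# action; a large positive threshold b resists that pull (stubbornness).
W = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
b = np.array([0.0, 0.0, 3.0])
x = np.array([1, 1, 1])
```

Here player 0 (threshold 0) happily follows its neighbors, while player 2's high threshold of 3 outweighs the total incoming influence of 2, so it deviates to \(-1\).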

Fig. 8
figure 8

Histograms of thresholds. Frequency polygons of the Democratic and Republican senators’ thresholds are shown in these two plots. Each plot shows the frequency polygons of the four spheres. The frequency polygon of Sphere 1 (Security & Armed Forces) is “flatter” than the other three spheres for both Democratic and Republican senators

Figure 8 shows the threshold histograms for the two parties. The most interesting aspect of these histograms is that for both Democratic and Republican senators, the threshold distribution is “flatter” in Sphere 1 (Security & Armed Forces) compared to the other spheres. This indicates that for both parties, the thresholds are more “uniformly distributed” in Sphere 1 than in the other spheres. In contrast, in Spheres 2 (Economics & Finance), 3 (Energy & Infrastructure), and 4 (Public Welfare), the threshold values of each party are concentrated in one region, which indicates the similarity among the senators belonging to the same party. Together with negative cross-party edges and positive intra-party edges, this contributes to polarization in these spheres. While Fig. 8 shows the histogram of each party for different spheres, Fig. 9 makes a comparison of the histograms of the two parties for each sphere separately. The contrast between the two parties is not as remarkable as the contrast among the four spheres for any party.

As a final note, we emphasize that the threshold values on their own lack sufficient predictive power. In fact, the main component of the LIG model is the interdependence among the senators’ actions through the influence structure. Having said that, if a sphere is overwhelmingly dominated by bills sponsored by one of the two parties, then it is possible that the machine learning algorithm would assign low threshold values to the senators of that party (that is, those senators would be predisposed to voting yea).Footnote 5 Even then, the influence weights would play a role in predicting the stable outcomes. Investigating this issue using sponsorship and co-sponsorship data is an interesting future direction.

Fig. 9
figure 9

Comparative histograms of Democratic and Republican thresholds. Frequency polygons of the Democratic and Republican senators’ thresholds are compared for the four spheres. We do not immediately see any significant contrast

Modularity

Furthermore, a formal study of polarization rooted in network science produces similar results. Modularity [23, 41, 42] has been widely used as a measure of polarization in networks. We apply the following definition of modularity derived for directed networks with signed weights [20]:

$$\begin{aligned} Q=\frac{1}{2w^++2w^-}\displaystyle \sum _{i}\displaystyle \sum _{j}\left[ w_{ij}-\left( \frac{w_i^{+,out}w_j^{+,in}}{2w^+}-\frac{w_i^{-,out}w_j^{-,in}}{2w^-}\right) \right] \times \delta \left( C_i,C_j\right) . \end{aligned}$$

Here, \(w_{ij}\) is the weight of edge i to j, \(w_{{ij}}^{ + } = {\text{max}}\{ 0,w_{{ij}} \}\), \(w_{ij}^- = {\text{max}}\{0, -w_{ij}\}\), and \(2w^\pm\) is the total weight of all positive or negative edges, expressed by \(\sum _{i}\sum _{j}w_{ij}^\pm\). Furthermore, \(w_{i}^{{ \pm ,{\text{out}}}}\) is the weighted out-degree \(\sum _{k}w_{ik}^\pm\) and \(w_j^{\pm ,in}\) is the weighted in-degree \(\sum _{k}w_{kj}^\pm\). The Kronecker delta function \(\delta \left( C_i,C_j\right)\) is 1 if i and j belong to the same party; it is 0 otherwise.
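The definition above translates directly into code. The small signed network below is a hypothetical example (positive edges within each community, negative edges across), not one of the learned Senate networks:

```python
import numpy as np

def signed_directed_modularity(W, parties):
    """Modularity for directed, weighted networks with signed edges,
    following the formula in the text. Assumes both positive and
    negative edges are present (so neither 2w^+ nor 2w^- is zero)."""
    Wp = np.maximum(W, 0.0)            # w_ij^+
    Wm = np.maximum(-W, 0.0)           # w_ij^-
    two_wp = Wp.sum()                  # 2w^+
    two_wm = Wm.sum()                  # 2w^-
    out_p, in_p = Wp.sum(axis=1), Wp.sum(axis=0)
    out_m, in_m = Wm.sum(axis=1), Wm.sum(axis=0)
    expected = (np.outer(out_p, in_p) / two_wp
                - np.outer(out_m, in_m) / two_wm)
    same = np.equal.outer(parties, parties)   # Kronecker delta
    return float(((W - expected) * same).sum() / (two_wp + two_wm))

W = np.array([[0., 1., 0., -1.],
              [1., 0., -1., 0.],
              [0., -1., 0., 1.],
              [-1., 0., 1., 0.]])
parties = np.array([0, 0, 1, 1])
Q = signed_directed_modularity(W, parties)   # high Q: polarized toy network
```

In this perfectly polarized toy network, all positive weight falls inside the communities and all negative weight across them, which drives \(Q\) up, mirroring the high scores reported below for the more polarized spheres.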

Applying this definition, we obtain the following modularity scores for the four spheres of legislation, respectively: 0.7861, 0.8904, 0.8724, and 0.8857 (see Table 5). This shows that Sphere 1 (Security & Armed Forces) is least polarized, while Spheres 2 (Economics & Finance), 3 (Energy & Infrastructure), and 4 (Public Welfare) are much more polarized.

It is important to note that modularity does not always indicate polarization. As Guerra et al. [22] show, there are networks that exhibit community structure despite not being polarized. However, in our case, we are not investigating whether Congress is polarized or not. Polarization in Congress is already a settled matter [15]. We are rather investigating to what degree Congress is polarized based on the spheres of legislation. Furthermore, our analysis of cross-party edges ("Cross-party edges" section) resonates with Guerra et al.’s main idea that in a polarized network, the nodes at the border are on average more connected inside their own community than outside.

Most influential nodes in context

A number of centrality measures exist that are derived from structural analysis of networks [31]. However, our model is behavioral: nodes adopt their best responses to each other. In a strictly game-theoretic model of behavior, a set of nodes is called most influential with respect to achieving a desirable stable outcome if their choice of actions leads the whole system of influence to that desirable stable outcome [29, 30]. Here, a crucial aspect is the desirable stable outcome, represented by a PSNE. For example, suppose our desirable stable outcome is to pass a bill by a 100–0 vote. A set of senators is called most influential if their voting together influences every other senator to also vote for the bill, thereby making the desirable stable outcome the unique PSNE outcome. This concept can be extended to other types of desirable stable outcomes, like blocking a bill unanimously, passing a bill with at least 60 votes, forcing/avoiding a filibuster, etc. When there are multiple most influential sets, we naturally prefer smaller ones.

The above concept of most influential nodes is centered around stable or PSNE outcomes. As we will see in "Computing most influential nodes" section, it requires computation of all PSNE. We next outline how we compute all PSNE for each sphere of legislation.

PSNE computation

Once the LIG is instantiated by the machine learning algorithm (see "Machine learning" section), we can compute the set of all PSNE using the algorithm described in [30]. This is a backtracking search algorithm which takes advantage of the graph’s structure. We give a brief overview below.

The algorithm begins by selecting the node with the highest out-degree—the node that directly influences the most other nodes—and assigns it the action 1. It progressively selects new nodes and assigns them the action 1 until either all nodes are assigned actions without any contradiction (indicating a PSNE) or it encounters a contradiction that guarantees there is no PSNE with the actions assigned so far. It then revisits the most recent node and changes its action from 1 to \(-1\). After this, the algorithm again tries to make progress. In general, at any stage of the algorithm, we have a partial joint action, which is the action (1 or \(-1\)) of each node selected so far. If some node in the network is not playing its best response, the partial joint action cannot lead to a PSNE. When this occurs, the algorithm tries a different action for the most recently selected node v if it has not already done so. If trying a different action for v still leads to a contradiction, the algorithm backtracks by deselecting v and changing the action of the node that had been selected before v. When every node is playing its best response with respect to the others, we have reached a PSNE. Importantly, the algorithm always tries to reach a contradiction quickly so that it can reduce the overall computation time by pruning large parts of the search tree. This process repeats until all PSNE have been found.
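For intuition, the PSNE condition being checked at each step can be expressed directly. The brute-force enumeration below is a naive stand-in for the backtracking algorithm of [30], which prunes this exponential search using the graph structure; it is only feasible for tiny games:

```python
import itertools
import numpy as np

def is_psne(x, W, b):
    """x is a PSNE iff every player's action is a best response:
    x_i = 1 requires f_i >= 0 and x_i = -1 requires f_i <= 0
    (ties allowed either way; assumes W has a zero diagonal)."""
    f = W @ x - b
    return all((xi == 1 and fi >= 0) or (xi == -1 and fi <= 0)
               for xi, fi in zip(x, f))

def all_psne_bruteforce(W, b):
    """Enumerate all PSNE exhaustively over the 2^n joint actions.
    The paper's backtracking algorithm avoids this full enumeration."""
    n = len(b)
    return [x for x in itertools.product([1, -1], repeat=n)
            if is_psne(np.array(x), W, b)]

# Two mutually reinforcing players: both (1, 1) and (-1, -1) are stable.
W = np.array([[0., 1.],
              [1., 0.]])
b = np.zeros(2)
equilibria = all_psne_bruteforce(W, b)
```

The toy game also illustrates why the PSNE count is tiny relative to \(2^n\): only the two coordinated joint actions survive the best-response check.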

For each sphere, we ran the algorithm on Bowdoin College’s high-performance computing (HPC) grid. The number of PSNE computed for each sphere’s LIG, given our chosen \(\rho\) values, is summarized in Table 4. Note that the number of PSNE is a tiny fraction of the \(2^{103}\) possible joint actions. These sets of all PSNE are necessary for computing the most influential senators, which we describe next.

Table 4 \(\rho\) values selected through cross-validation and the corresponding PSNE counts across different spheres

Computing most influential nodes

Algorithmically, the most influential nodes problem asks for a minimum set of nodes such that when they choose their actions according to the desirable stable outcome (e.g., voting yea when the desirable stable outcome is passing a bill unanimously), the desirable stable outcome becomes the only possible PSNE. An approximation algorithm for computing most influential senators was given by Irfan and Ortiz [30]; it produces a directed acyclic graph (DAG). The algorithm requires precomputation of all PSNE, which is a provably hard problem [30]. We apply Irfan and Ortiz’s PSNE computation algorithm to the LIG for each sphere of legislation. Having computed all the PSNE, we then compute the DAG representing the most influential sets of nodes. Figures 10, 11 show the results of the most influential nodes algorithm for Spheres 1 and 3, respectively, where the desirable stable outcome is to achieve the largest number of yea votes possible in any PSNE (that is, to gain the most support possible from the legislative body according to our model). To read Figs. 10, 11, inspect each DAG and find a top-to-bottom path; each such path gives a most influential set.

Fig. 10
figure 10

Most influential nodes for Sphere 1 (Security & Armed Forces): Directed acyclic graphs (DAGs) representing sets of most influential nodes. Any top-to-bottom path gives a most influential set. Here, 4 Republicans and 4 Democrats are most influential

Fig. 11
figure 11

Most influential nodes for Sphere 3 (Energy & Infrastructure): The interpretation is similar to Fig. 10. Here, 5 Republicans and 6 Democrats are most influential

The sets of most influential senators in each sphere support the inferences gained from analyzing the LIG networks. As illustrated in Fig. 10, in Sphere 1 (Security & Armed Forces), 4 Republicans and 4 Democrats comprise a set of 8 most influential senators. In other words, 8 senators and, more importantly, the balanced bipartisan groups of 8 senators shown in Fig. 10 are sufficient to generate the maximum possible support for a bill in Sphere 1. As shown in Fig. 11, in Sphere 3 (Energy & Infrastructure), 5 Republicans and 6 Democrats comprise a set of 11. This suggests that Sphere 3 is more polarized than Sphere 1, since it requires a larger body of influencing senators. The DAGs for the other spheres are shown in Appendix E.

Game-theoretic vs. structural centrality measures

In the above game-theoretic formulation of most influential nodes, we find that each set of most influential senators across all spheres is composed of an (almost) equal number of Democrats and Republicans. This signifies the need for bipartisan support to guarantee passing a bill with the maximum possible support under the PSNE constraints. As we show next, this also happens to be a distinguishing feature between game-theoretic and structural measures. Table 5 shows various centrality measures and other quantities computed for each sphere.

First, measures like diameter, average shortest path length, and clustering coefficient reveal some, but not many, differences among the spheres. The network diameter of Sphere 3 (Energy & Infrastructure) is 4, and the network diameter of every other sphere is 5. The average shortest path lengths of the four spheres are similar to one another, ranging between 2.2295 and 2.5476. Being close to half the network diameter, these values suggest that most nodes in the network are well connected, though not all. The average clustering coefficient is a measure of the density of triangles in a network. In more polarized networks, we might expect this value to be high because senators who are closely aligned on partisan issues would be well connected with each other. The average clustering coefficients of the spheres are similar, but lower in Spheres 1 and 3 (0.0187 and 0.0174, respectively) than in Spheres 2 and 4 (0.0206 and 0.0218, respectively). These measures, however, do not give a direct indication of polarization, at least not as much as the modularity measure. We discussed the modularity values in "Polarization in context" section.

We now focus on the widely applied structural measures of centrality. For each sphere, we show the top 10 most central senators with respect to four centrality measures: degree, closeness, betweenness, and eigenvector. The simplest is degree centrality, the number of nodes a node is connected to (normalized by the maximum possible degree, \(N-1\), or 102 in our case). The second is closeness centrality, or how close a node is, on average, to every other node in the network. The third is betweenness centrality, which measures how often a node lies on the shortest paths between other pairs of nodes. The final measure is eigenvector centrality, which has a self-referential definition: a node's centrality depends on the centrality of its neighbors.
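Two of these measures are easy to sketch on a toy undirected graph (a 4-node star, not one of the learned Senate networks; in practice a library such as NetworkX computes all four):

```python
import numpy as np

def degree_centrality(A):
    """Degree centrality: each node's degree normalized by N - 1."""
    n = len(A)
    return (A != 0).sum(axis=1) / (n - 1)

def eigenvector_centrality(A, iters=200):
    """Eigenvector centrality via power iteration: a node is central when
    its neighbors are central (principal eigenvector of the adjacency).
    The identity shift guarantees convergence on bipartite graphs."""
    x = np.ones(len(A))
    M = A + np.eye(len(A))
    for _ in range(iters):
        x = M @ x
        x = x / np.linalg.norm(x)
    return x

# Star graph on 4 nodes: node 0 is the hub and should rank highest.
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)
deg = degree_centrality(A)
ev = eigenvector_centrality(A)
```

Both measures agree that the hub is most central here; on the learned networks, the four measures can and do disagree on the top senators.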

Most notably, these centrality measures do not capture the strategic aspects of behavior. Across most measures, Republican senators are overrepresented, comprising the majority of the top ten most central nodes. In contrast, the game-theoretic measure gives a balanced coalition of Democrats and Republicans. This is important because when networks are polarized, achieving a desirable stable outcome requires support from both sides.

Table 5 Network analysis of learned influence networks for different spheres of legislation. Various centrality measures and network-level properties are shown

Toward richer models: ideal point models with social interactions

We also apply a richer model of influence recently proposed by Irfan and Gordon [28] that extends the LIG model by incorporating ideal points of senators and polarities of bills. Their work showed the value of combining game-theoretic and statistical models for studying strategic interactions in context, but they assume the network to be fixed, regardless of the bill context. We use their model and allow the network to change based on the spheres of legislation. We also perform an analysis of the networks learned.

We start with an overview of how Irfan and Gordon’s model [28] builds on the political science literature on ideal point models [8, 13, 32, 45, 46, 48]. Ideal point models are predictive statistical models that assign each senator i an ideal point \(p_i\) signifying the senator’s legislative position. Usually, more negative values of \(p_i\) mean more liberal position and more positive values mean more conservative. Similarly, each bill l is also assigned a polarity \(a_l\) signifying the position of the bill in the liberal to conservative spectrum. There is a third model parameter called the popularity \(r_l\) of bill l representing the fraction of senators supporting the bill. The ideal point model in its most basic form defines the probability of senator i supporting bill l using the following logistic function \(\sigma\):

$$\begin{aligned} p(x_{i,l}=\textit{yea}\ |\ p_i, a_l, r_l) = \sigma (p_i a_l + r_l). \end{aligned}$$
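In code, the basic ideal point model is a one-line logistic computation. The numbers in the example are hypothetical, chosen only to illustrate the signs of the parameters:

```python
import math

def p_yea(p_i, a_l, r_l):
    """Basic ideal point model: probability that senator i votes yea on
    bill l, i.e., sigma(p_i * a_l + r_l) with sigma the logistic function."""
    return 1.0 / (1.0 + math.exp(-(p_i * a_l + r_l)))

# A conservative senator (p_i > 0) on a conservative bill (a_l > 0) is
# more likely to vote yea than a liberal senator (p_i < 0) on the same
# bill; the popularity term r_l shifts everyone's probability together.
yea_conservative = p_yea(1.5, 1.0, 0.0)
yea_liberal = p_yea(-1.5, 1.0, 0.0)
```

When \(p_i a_l + r_l = 0\), the model is maximally uncertain and predicts yea with probability 0.5.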

The ideal point model captures the interdependence among the senators using the \(r_l\) term. However, this term is an aggregate measure quantified by the number of senators voting yea on bill l. In ideal point models with social interactions, Irfan and Gordon expand this aggregate measure by considering how the individual senators are voting and how their votes influence each other [28]. The resulting model is game-theoretic with the following influence function. Here, other than the new terms l, \(p_i\), and \(a_l\) defined in the previous paragraph, the rest of the terms are the same as those in Definition 3.1:

$$\begin{aligned} f_i({\mathbf {x}}_{-i}, l)&\equiv \sum _{j \ne i} {w}_{ij}x_j+(p_i\cdot a_l) - b_i. \end{aligned}$$

Using the above influence function, the richer game-theoretic model is defined in the same fashion as "The LIG model" section.

As a cautionary note, the way Irfan and Gordon’s model [28] combines networks with ideal points makes it difficult to disentangle the two. Analyzing the networks alone may be inconclusive because ideal points also supply the model with predictive power. Moreover, the machine learning algorithm learns these two components simultaneously. With this caveat in mind, we give an analysis of the networks and the ideal points learned.

Analysis of influence networks. Figs. 12, 13 show the learned networks for Sphere 1 (Security & Armed Forces) and 2 (Economics & Finance) under this richer model (other spheres are in Appendix F.3). First, it is evident that the two parties are not as clustered as they were in the LIG model (compare with Figs. 3, 4). Second, a closer look at the cross-party edges shows that there are a lot more negative edges between the two parties under this richer model than there are under the LIG model. We show the cross-party edges for Spheres 1 and 2 in Figs. 14, 15, respectively (others are in Appendix F.4). These two differences can be attributed to using ideal points to discriminate the behaviors of opposing senators.

Fig. 12
figure 12

Learned influence network for Sphere 1 (Security & Armed Forces) under the ideal point model with social interactions. A bird’s eye view of the influence network for Sphere 1 is shown. The strongest 33% of the edges are shown here. Contrast this with Fig. 3 where the two parties are more distinctly clustered

Fig. 13
figure 13

Learned influence network for Sphere 2 (Economics & Finance) under the ideal point model with social interactions. A bird’s eye view of the influence network for Sphere 2 is shown. The strongest 33% of the edges are shown here. Contrast this with Fig. 4 where the two parties are separated to a great extent

Fig. 14
figure 14

Cross-party edges in Sphere 1 (Security & Armed Forces) under the richer model. Only the strongest 40% of cross-party edges are shown. 123 cross-party edges are positive, and 47 are negative

Fig. 15
figure 15

Cross-party edges in Sphere 2 (Economics & Finance) under the richer model. Only the strongest 40% of cross-party edges are shown. 86 cross-party edges are positive, and 57 are negative

Polarization metric based on modularity. The modularity framework discussed in "Polarization in context" section yields scores of 0.5392, 0.6801, 0.6887, and 0.6229, respectively. Both the ideal point metric and modularity scores indicate that Spheres 2 (Economics & Finance) and 3 (Energy & Infrastructure) are most polarizing, whereas Sphere 1 (Security & Armed Forces) is least polarizing. Sphere 4 (Public Welfare) sits in between. These results are somewhat similar to our earlier conclusions based on LIG without using ideal points. We include a broader analysis of the learned influence networks in Appendix F.5.

Polarization metric based on ideal points. We now apply the well-known ideal point-based polarization metric (i.e., the distance between the means of the two parties) [40] to calculate polarization levels across the four spheres. The ideal point distributions for two of the spheres are depicted in Figs. 16, 17 (others are in Appendix F.2). Applying the ideal point-based polarization metric, we obtain values of 0.754, 1.235, 1.126, and 0.889 for Spheres 1–4, respectively. Evidently, Sphere 1 is least polarizing with respect to the ideal point distributions alone. Note that in our computation, we have not used the scaled versions of the ideal point distributions shown in Figs. 16, 17; instead, we use the non-scaled, machine-learned ideal points. The scaled ideal points are more amenable to comparison, but we have observed similar results for the non-scaled versions.

Fig. 16
figure 16

Ideal points of Sphere 1 (Security & Armed Forces): The ideal point distributions of Democratic (blue +) and Republican (red x) senators, scaled linearly between \(-1\) and 1, are shown. The distance between the mean ideal points of the two parties is 0.754, showing that Democratic and Republican senators are ideologically close to each other when it comes to national security

Fig. 17
figure 17

Ideal points of Sphere 2 (Economics & Finance): The distance between the mean ideal points of the two parties is 1.235, which shows more polarization compared to Sphere 1 shown in Fig. 16

We have also applied a recently proposed measure called polarization index [39]. Inspired by the electric dipole moment, the polarization index is measured from an opinion distribution, where opinions propagate from a set of elite entities (e.g., influential politicians and media accounts on Twitter) to listener entities (e.g., ordinary individuals on Twitter). The measure is based on opinion distribution (as opposed to dynamics). Here, we apply it to the machine learned ideal point distribution.Footnote 6 We use the following definition of the polarization index \(\mu\), where \(\Delta A\) represents the difference between the fraction of Republicans and Democrats and \(gc^+\) and \(gc^-\) represent the gravity centers of the Republican and Democratic senators’ ideal points, respectively:

$$\begin{aligned} \mu = (1 - \Delta A) (gc^+ - gc^-)/2. \end{aligned}$$
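As a sketch of this computation (taking the gravity centers to be simple means of the ideal points and \(\Delta A\) as an absolute difference of party fractions, which are our reading of the definition, and using hypothetical ideal points):

```python
import numpy as np

def polarization_index(rep_points, dem_points):
    """Polarization index mu = (1 - dA) * (gc+ - gc-) / 2, with dA the
    (absolute) difference between the two parties' population fractions
    and gc+/gc- the gravity centers (here: means) of the Republican and
    Democratic ideal points, respectively."""
    n = len(rep_points) + len(dem_points)
    dA = abs(len(rep_points) - len(dem_points)) / n
    gc_plus = float(np.mean(rep_points))
    gc_minus = float(np.mean(dem_points))
    return (1 - dA) * (gc_plus - gc_minus) / 2

# Hypothetical balanced chamber: equal party sizes, so dA = 0 and mu
# reduces to half the distance between the two gravity centers.
rep = np.array([0.8, 1.0, 1.2])
dem = np.array([-0.9, -1.1, -1.0])
mu = polarization_index(rep, dem)
```

With equal party sizes, \(\mu\) is exactly half the mean-distance metric, which is the "constant factor" relationship noted below.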

This definition produces the following polarization indices for the four spheres, respectively: 0.3588, 0.5874, 0.5356, and 0.4227. These are a constant factor off of the ideal point-based polarization measures [40] presented before due to the similarity between the two definitions in our case.

We conclude this section by reiterating an earlier point. Investigating the influence networks and ideal points separately does not give us the complete picture, since the model combines these two components together to make predictions. Therefore, we should also combine them in a meaningful way to infer polarization. We leave this as future work. We also leave open an exploration of the most influential nodes problem under this richer model.

Concluding remarks and research outlook

In this paper, we have studied the linear influence game (LIG) model in the context of four spheres of legislation. We have done a thorough network analysis of the machine learned models for each sphere. Our analysis shows that, contrary to the popular notion that the U.S. Congress is overly polarized these days, the degree of polarization varies according to the spheres of legislation. In fact, the two opposing parties tend to come together when dealing with bills in Sphere 1 (Security & Armed Forces). Therefore, the notion of polarization should be contextualized with respect to the spheres of legislation.

We have also shown that across all the spheres, the LIG model predicts that a set of most influential senators consists of a bipartisan coalition (which also differentiates game-theoretic and structural centrality measures). Despite this shared property among the four spheres, the number of senators required to form a most influential set varies. Sphere 1 happens to require the least number of senators in its most influential set to achieve a desirable outcome of garnering the maximum support possible for a bill (under PSNE constraints). Again, this signifies that Sphere 1 is least polarized among the four spheres.

In sum, the consideration of different spheres of legislation reveals interesting aspects of polarization and most influential senators in Congress. Building upon this study, the following are some interesting future directions.

  1. The most pressing task is to fully explore the ideal point model with social interactions [28] for different spheres of legislation. We have briefly touched upon it in "Towards richer models: ideal point models with social interactions" section. However, as we have mentioned there, finding a behavioral definition of polarization that meaningfully combines the constituent parts of the model, such as senators' ideal points, bill polarities, influence weights, and threshold values, remains an open problem.

  2. In a similar vein, the notion of context provides another interesting direction. In this paper, we use spheres of legislation as a contextual platform for learning, analyzing, and comparing influence networks; the main idea is that the influence network differs depending on the sphere. In contrast, the ideal point model with social interactions also has a contextual element, in terms of bill polarities and senators' ideal points, but it keeps the network fixed. How to synthesize these two diverging ways of capturing context, and thereby give context a deeper meaning, remains open.

  3. A detailed comparative study of the most influential nodes for different spheres under the richer model [28] is another interesting direction. In particular, what happens to the balanced, bipartisan composition of the most influential sets under the LIG model (see "Most influential nodes in context" section) when we incorporate additional contextual parameters like ideal points and polarities?

  4. On the computational front, Irfan and Gordon [28] showed promising results on reducing the time required to compute all PSNE. Extending those results to the spheres of legislation setting is a natural next step. It would also be interesting to investigate why their model leads to such a drastic improvement in computation time.

  5. Considering different modeling frameworks is yet another exciting direction. A particularly promising framework is probabilistic graphical models (PGMs). Whereas we currently construct the spheres of legislation first and then learn an LIG model for each sphere, PGMs may allow us to do both at the same time, without splitting the data. Finally, exploring the recently proposed semi-supervised learning approach to studying polarization [25] in game-theoretic settings is also an interesting direction.
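On the computational direction above, a brute-force baseline helps fix ideas about what "computing all PSNE" involves. The sketch below enumerates all pure-strategy Nash equilibria of a small LIG in the commonly used form (binary votes, linear best responses with thresholds); the exact formulation in the paper may differ, and the quadratic-form details are simplified here.

```python
from itertools import product

def all_psne(W, b):
    """Brute-force enumeration of all pure-strategy Nash equilibria of
    a linear influence game: player i votes +1 (yes) or -1 (no) and
    best-responds with the sign of sum_j W[i][j]*x[j] - b[i].
    Exponential in n; feasible only for tiny examples."""
    n = len(W)
    equilibria = []
    for profile in product((-1, 1), repeat=n):
        stable = True
        for i in range(n):
            payoff = sum(W[i][j] * profile[j]
                         for j in range(n) if j != i) - b[i]
            # A tie (payoff == 0) leaves player i indifferent,
            # so either vote counts as a best response.
            if (payoff > 0 and profile[i] != 1) or \
               (payoff < 0 and profile[i] != -1):
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria

# Two mutually reinforcing players: both-yes and both-no are the PSNE.
W = [[0.0, 1.0], [1.0, 0.0]]
b = [0.0, 0.0]
print(all_psne(W, b))  # [(-1, -1), (1, 1)]
```

The exponential blow-up of this baseline is precisely why the structural speedups reported in [28], and their possible extension to per-sphere networks, matter.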

In addition to the above open directions in the context of legislative chambers, the LIG model may also be applied to other settings where network-connected individuals exhibit influence or behavioral interdependence. Some examples in the public health domain are smoking [7] and obesity [6]. Other promising areas include smart electricity grids, vaccination, and the adoption of microfinance.