Detecting Dividing Lines in Turnout: Spatial Dependence and Heterogeneity in the 2012 US Presidential Election

Fiorino, Nadia; Pontarollo, Nicola; Ricciuti, Roberto

doi:10.1007/s41651-022-00127-9

Open access
Published: 11 November 2022

Volume 6, article number 34 (2022)

Journal of Geovisualization and Spatial Analysis

A Correction to this article was published on 16 February 2023

This article has been updated

Abstract

US voters have been moving further and further apart, most notably in terms of partisanship. This trend has led to a strong geographic concentration of voters’ preferences. We look at how turnout shows a similar pattern by jointly addressing two features of the data: spatial autocorrelation and heterogeneity of the observed units. Results obtained through a spatial lag regression tree procedure for the 2012 US presidential elections allow us to identify eleven groups of counties with similar characteristics. We find that (i) close counties behave similarly in terms of turnout; (ii) across various groups of counties, some variables have different statistical significance (or lack of it, such as household income and unemployment), and often different signs (such as the shares of adherents to congregations, Blacks, and Hispanics, and urban population). These results are useful for targeting geographically based groups in get-out-the-vote operations.

Introduction

In November 2020, Joe Biden won the presidential election in Georgia—for the first time since 1992—and in January 2021, the two Democrat candidates for the US Senate won the runoffs in the same State. Pundits credit these outcomes to two concurring forces: the shift from Republican to Democrat in the suburbs of Atlanta and the increased turnout of the Black population.^{Footnote 1} The former is the local incarnation of a wide liberal shift of well-educated voters. The latter has been nurtured by former state representative Stacey Abrams and her New Georgia Project over the last decade.^{Footnote 2}

These recent examples show how turnout has become a decisive and divisive issue in US politics. Over the last two decades, the US electorate has become increasingly polarized in socio-demographic, economic, and ideological terms (among others, Boxell 2020). This ongoing polarization is reflected in the growing and deepening divisions between Republicans and Democrats on fundamental political values, such as government, abortion, race, immigration, national security, and environmental protection. According to the Pew Research Center (2017), the average partisan gap has increased from 15 percentage points in 1994 to 36 percentage points in 2017. One of the most significant consequences of this tendency is the entrenchment of political partisanship in different areas, with cities becoming more liberal and rural areas more conservative (Storper 2018; Durkan 2021). Several causes have been identified for this increase, such as political activism, election policies, in-group bias, and media bubbles (see among others Iyengar et al. 2019).^{Footnote 3}

This premise helps define the research question this paper investigates: Do socioeconomic and demographic dividing lines (i.e., heterogeneity) also apply to the decision to vote?^{Footnote 4}

We focus on the geographical parcellation of the electorate into several homogeneous subgroups scattered around the country by analyzing the non-linear effects of the determinants of voter turnout (i.e., the effects producing a sort of “polarization” of voters along the determinants of electoral behavior) in combination with the spatial dependence of voter turnout (i.e., the “neighborhood effect”). We investigate these two issues by relying on recent work by Wagner and Zeileis (2019), which combines the regression tree from Zeileis et al. (2008) with a spatial autoregressive parameter. Integrating these two methodologies makes it possible to deal with the non-homogeneity of regression coefficients, thereby capturing the role of some critical social junctures, and to cope with the potentially distortionary effects of spatial dependence.

The electoral turnout at the county level in the 2012 US presidential elections provides the empirical setting for our research question. The results confirm a complex multi-regime picture of the determinants of electoral participation. The choice of 2012 and not a more recent election allows us to show that this process has been already in the making for quite some time, before becoming more clearly visible.

By answering our research question, we contribute to the extant literature in two ways. First, when dealing with fine geographical data, two possible concerns may plague the relationship between voter turnout and its covariates. On the one hand, the relationship between turnout and its covariates may not be the same for all the units that are observed, raising the possibility of non-linearity. On the other hand, despite the large set of covariates one may consider, there may still be omitted factors showing a geographical or spatial component that influence electoral participation (Moretti 2012). Research on voter turnout in US presidential elections has increasingly underlined the role of geographical influences, hinting at voting behavior as a result of a multidimensional process that occurs in space and crucially reflects—and is mediated by—the social and geographical environment where individuals are located and interact (Agnew 1987; Pattie and Johnston 2000). However, incorporating heterogeneity has received little attention in the current research on electoral participation.

Second, considering heterogeneity in voter turnout in US presidential elections may help to better understand how local disparities translate into the election outcomes and can be used by the competitors to propose targeted electoral policies to win the elections, especially in closely contested states or electoral districts.

The paper is structured as follows. The “Literature Review” section reviews the existing literature, the “Empirical Strategy, Methodology, and Data” section describes the empirical strategy and data, the “Results” section illustrates the results, and the “Conclusions” section concludes.

Literature Review

Elections are a central feature of democracy, and scholars have long tried to identify and explain the variation in voter turnout. Decades of theoretical and empirical research have shown the latter to be mainly influenced by socio-economic, cultural, demographic characteristics, and politico-institutional environment at the individual as well as at aggregate level (see Cancela and Geys 2016).

Complementing this approach and drawing on the tradition of political geography, a growing number of studies have been paying attention to the role of geographical influences on electoral turnout (among others, Agnew 1987; Pattie and Johnston 2000). Voters do not cast their ballots regardless of where they live. Their voting behavior is rather the outcome of a complex process that occurs in space and is influenced and mediated by their respective environments.

From this perspective, neighborhood processes, and sociogeographical and household interactions (Fieldhouse and Cutts 2008, 2012; Galster 2012) produce particular political traditions, practices, and outcomes (Gimpel et al. 2003), and promote a shared outlook, which translates into widely held attitudes about the value of political and electoral participation. The decision about whether to vote and how to exercise one’s franchise is influenced by local information exchanges. In sum, “people who talk together, vote together” (Pattie and Johnston 2000).

Several scholars have analyzed the determinants of voting in the US (among others Kahane 2009; Cann and Cole 2011). Within this strand of literature, an emerging line of research uses spatial econometrics techniques to reveal the influence of local contests and interactions on turnout. A number of studies have also explored the geography of the US electoral behavior through a spatial econometric methodology to probe the existence of spatial dependence. Kim et al. (2003) test the hypothesis of reward–punishment and issue–priority voting behavior in the presidential elections from 1988 to 2000 using a spatial lag model in a Bayesian framework and find significant spatial dependence. Tam Cho and Rudolph (2008) investigate the spatial structure of political participation through a spatial lag model of political participation across thirty-two cities and eighteen states, covering every region of the nation. Their findings show that spatial proximity influences voter turnout. Moreover, the spatial structure of electoral participation is consistent with a contagion process that occurs irrespective of involvement in social networks. Lacombe et al. (2014) focus on the potential role that spillovers may exert in explaining voter turnout in the 2004 presidential election. Exploiting advances in Bayesian computation, they compare the normal a-spatial linear model, the spatial lag model, the spatial Durbin model and the spatial Durbin error model, and find that the latter is the most appropriate empirical model. Their results show the existence of direct and indirect effects from the set of explanatory variables on voter turnout. Furthermore, several variables traditionally shown to affect voter turnout (i.e., per capita income and the county unemployment rate) are not associated with turnout at the county level.

These contributions, although revealing important insights into voting behavior by recognizing that space and context matter, assume that the relationship between voter turnout and the explanatory variables is linear. This assumption, which is already restrictive when analyzing local voters and politics more generally (see Calvo and Escolar 2003), is even more limiting when focusing on the American electorate, which is becoming more heterogeneous and polarized in terms of demographic, socio-economic, and cultural characteristics (Glaeser and Ward 2006; Boxell et al. 2017).^{Footnote 5} In an increasingly polarized landscape, incorporating heterogeneity in presidential election models could capture a crucial feature for the understanding of voting behavior.^{Footnote 6}

Empirical Strategy, Methodology, and Data

To address the research question, our empirical strategy consists of applying a spatial autoregressive regression tree methodology (Wagner and Zeileis 2019).^{Footnote 7} Such a methodology applies a regression tree approach after filtering out spatial autocorrelation in the dependent variable. The regression tree approach, initiated by Morgan and Sonquist (1963), is a recursive algorithm that checks for parameter instability, i.e., non-linearities, in each covariate, and splits the sample accordingly. In detail, in the first step, the algorithm splits the sample in two at the threshold on the selected covariate that minimizes the global residual sum of squares. After splitting, the algorithm proceeds recursively to further split the resulting subsamples by the same method, creating more child nodes and continuing until no further non-linearities are found or until a stopping rule becomes binding. This enables multiple regimes from a set of control variables to be endogenously identified. However, when using geo-referenced data such as ours, spatial autocorrelation in the residuals may occur, leading the regression tree, and regression analysis more generally, to biased and/or inefficient results (Anselin 1988).
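The splitting step described above can be illustrated with a minimal sketch. It assumes a single regressor and a single partitioning variable, and it searches exhaustively for the threshold that minimizes the pooled residual sum of squares of two separate linear fits. The function names (`ols_rss`, `best_split`), the `min_size` parameter, and the toy data are hypothetical and are not taken from the paper or from any library.

```python
def ols_rss(x, y):
    """Residual sum of squares of the simple linear fit y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def best_split(z, x, y, min_size=3):
    """Return (rss, threshold) for the cut on z that minimizes the pooled RSS."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    best_rss, best_threshold = float("inf"), None
    for k in range(min_size, len(z) - min_size + 1):
        lo, hi = z[order[k - 1]], z[order[k]]
        if lo == hi:  # no cut is possible between tied values
            continue
        left, right = order[:k], order[k:]
        rss = (ols_rss([x[i] for i in left], [y[i] for i in left])
               + ols_rss([x[i] for i in right], [y[i] for i in right]))
        if rss < best_rss:
            best_rss, best_threshold = rss, (lo + hi) / 2
    return best_rss, best_threshold


# Toy data (hypothetical): the slope of y on x changes sign once z exceeds 5.
z = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9, 10, 9, 8, 7, 6]
rss, threshold = best_split(z, x, y)
print(threshold, rss)
```

On this toy sample the search recovers the cut between z = 5 and z = 6, where both sub-samples are fitted exactly. A full regression tree would apply the same search recursively inside each resulting subsample.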

To jointly identify the determinants of voter turnout we start from the spatial lag (SAR) model estimated for the full sample, following Anselin (1988):

$$Y=\rho \mathbf{W}Y+X{\varvec{\beta}}+{\varvec{\varepsilon}}$$

(1)

where Y is the vector of the dependent variable, W is an N × N spatial weight matrix, where N is the number of observations, X is the matrix of independent variables, β is the vector of coefficients, and ε is the vector of i.i.d. residuals. Finally, the parameter ρ is the spatial lag coefficient, which measures the strength of the spatial dependence; its admissible range is bounded by the reciprocals of the minimum and maximum eigenvalues of W, typically around −1 and 1. The initial SAR model is estimated via a maximum likelihood estimator.
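As a worked illustration of how such a maximum likelihood estimate is commonly obtained, the textbook concentrated log-likelihood of the spatial lag model can be written as a function of ρ alone. This is a sketch of the standard approach, not a statement of the authors' implementation; the symbols c (a constant), e_0, and e_L are introduced here for illustration and denote the OLS residuals from regressing Y and WY on X, respectively.

$$\ln L(\rho )=c+\ln \left|{\varvec{I}}-\rho \mathbf{W}\right|-\frac{N}{2}\ln \left[\frac{{\left({e}_{0}-\rho {e}_{L}\right)}^{\top }\left({e}_{0}-\rho {e}_{L}\right)}{N}\right]$$

Maximizing this expression over the admissible range of ρ gives $\widehat{\rho }$, after which the coefficients follow from least squares on the filtered variable: $\widehat{{\varvec{\beta}}}={\left({X}^{\top }X\right)}^{-1}{X}^{\top }\left({\varvec{I}}-\widehat{\rho }\mathbf{W}\right)Y$.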

After Eq. (1) is estimated, the spatial lag model can be rewritten as follows:

$$\left({\varvec{I}}-\rho \mathbf{W}\right)Y=X{\varvec{\beta}}+{\varvec{\varepsilon}}\quad \text{or, equivalently,}\quad {Y}^{*}=X{\varvec{\beta}}+{\varvec{\varepsilon}}$$

(2)

where I is an identity matrix and ${Y}^{*}=\left({\varvec{I}}-\rho \mathbf{W}\right)Y$ is the spatially filtered dependent variable, i.e., with the effect of autocorrelation taken out (Anselin and Bera 1998). Subsequently, Eq. (2) is estimated through ordinary least squares (OLS), which allows us to apply a standard (aspatial) regression tree approach. In our case, the algorithm is set to stop when it finds no additional significant non-linearities at the 5% level or when the sample is smaller than 100 counties. As is widely known in the spatial econometric literature, a benchmark generalized model is the spatial Durbin model (SDM). However, we did not apply such a model for the following reasons. First, we wanted to rely as much as possible on the framework proposed by Wagner and Zeileis (2019), which does not include the spatial lag of the independent variables. Second, the Wagner and Zeileis (2019) methodology aims at filtering spatial dependence in the dependent variable, not at preserving geographic proximity between units. As a consequence, given that even contiguous counties could belong to different nodes, we find it difficult to justify theoretically the inclusion of spatial lags based on the average values of contiguous counties. Finally, the inclusion of the spatial lags of the independent variables doubles the number of variables for which non-linearities can be found, complicating the interpretation of the outcomes of the model.
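The filtering step can be sketched in a few lines. The example below assumes a hypothetical three-unit chain with a row-standardized weight matrix and takes ρ = 0.5 as given (in the paper ρ comes from the maximum likelihood estimate of Eq. (1)). The helper names `spatial_filter` and `simple_ols` and all numbers are illustrative assumptions.

```python
def spatial_filter(y, w, rho):
    """Return y* = (I - rho * W) y: each value minus rho times its neighbors' weighted mean."""
    return [yi - rho * sum(wij * yj for wij, yj in zip(row, y))
            for yi, row in zip(y, w)]


def simple_ols(x, y):
    """Intercept and slope of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical chain of three units: 0 - 1 - 2, rows sum to one.
w = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
turnout = [60.0, 50.0, 40.0]   # made-up turnout values
covariate = [3.0, 2.0, 1.0]    # made-up covariate
rho = 0.5                      # assumed here, not estimated

filtered = spatial_filter(turnout, w, rho)
intercept, slope = simple_ols(covariate, filtered)
print(filtered, intercept, slope)
```

Once the dependent variable has been filtered in this way, any aspatial procedure (here a plain least-squares line) can be applied to it, which is the property the regression tree step relies on.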

The algorithm for identifying non-linearities in the covariates is based on Zeileis et al. (2008). According to the authors, the parametric model in Eq. (2) can be written as Ϻ(Y^*; β) with observations Y^* ∈ 𝒴 and a k-dimensional vector of parameters β ∈ Θ. Given n observations Y^*_i (i = 1, …, n), the model can be fitted by minimizing some objective function ψ(Y^*; β), yielding the parameter estimate

$$\widehat{\beta }={argmin}_{\beta \in\Theta }\sum\nolimits_{i=1}^{n}\psi ({Y}_{i}^{*}; \beta )$$

(3)

In our case, as the estimates are performed via OLS, $\psi$ is the error sum of squares, which corresponds to assuming that the observations Y^* are normally distributed with mean $\mu$ and covariance matrix $\Sigma$: Y^* ~ N($\mu$, $\Sigma$), with the combined parameter vector β = ($\mu$, $\Sigma$).

Given that in our specific case it is unreasonable to assume that a single global model Ϻ(Y^*; β) fits all n observations, we partition the observations with respect to covariates such that a well-fitting model can be found locally in each cell of the partition. To achieve this aim Zeileis et al. (2008) propose a recursive partitioning approach based on ℓ partitioning variables Z_j ∈ 𝒵_j (j = 1, …, ℓ), which in our case are the covariates in X, to adaptively find a good approximation of this partition.

More formally, Zeileis et al. (2008) assume that a partition {ẞ_b}_{b=1,…,B} of the space Z = Z₁ × … × Z_ℓ exists with B cells (or segments) such that in each cell ẞ_b a model Ϻ(Y^*; β_b) with a cell-specific parameter β_b holds. This segmented model is identified by Ϻ_ẞ(Y^*; {β_b}), where {β_b}_{b=1,…,B} is the full combined parameter.

Given the correct partition {ẞ_b}, the estimation of the parameters {β_b} that minimize the global objective function can easily be achieved by computing the locally optimal parameter estimates ${\widehat{\beta }}_{b}$ in each segment ẞ_b. However, if there is more than one partitioning variable (ℓ > 1) and {ẞ_b} is unknown, minimization of the global objective function

$$\sum\nolimits_{b=1}^{B}\sum\nolimits_{i \in {I}_{b}}\psi ({Y}_{i}^{*}; {\beta }_{b})\to min$$

(4)

over all conceivable partitions {ẞ_b} (with corresponding indexes I_b, b = 1, …, B) requires a greedy forward search in which the objective function can at least be optimized locally in each step. In particular, the algorithm has the following steps:

1. Fit the model once to all observations in the current node by estimating $\widehat{\beta }$ via minimization of the objective function. Recalling Eq. (3), the first-order condition is
$$\sum\nolimits_{i=1}^n\psi(Y_i^\ast;\widehat\beta)=0$$
(5)
where, here and in Eq. (6), $\psi \left({Y}^{*};\beta \right)=\frac{\partial \Psi ({Y}^{*};\beta )}{\partial \beta }$ denotes the score function (or estimating function), with $\Psi \left({Y}^{*};\beta \right)$ standing for the objective function minimized in Eq. (3).
2. Assess whether the parameter estimates are stable with respect to every ordering Z₁, …, Z_ℓ. If there is some overall instability, select the variable Z_j associated with the highest parameter instability; otherwise stop. To achieve this aim, it is necessary to check whether the scores fluctuate randomly around their mean 0 or exhibit systematic deviations from 0 over Z_j. These deviations can be captured by the empirical fluctuation process:
$${W}_{j}(t)={\widehat{J}}^{-1/2}{n}^{-1/2}\sum\nolimits_{i=1}^{\lfloor nt\rfloor }{\widehat{\psi }}_{\sigma ({Z}_{ij})}\quad (0 \le t \le 1)$$
(6)
where $\sigma ({Z}_{ij})$ is the ordering permutation that gives the antirank of the observation Z_ij in the vector Z_j = (Z_1j, …, Z_nj)^T. Thus, W_j(t) is simply the partial sum process of the scores ordered by the variable Z_j, scaled by the number of observations n and a suitable estimate $\widehat{J}$ of the covariance matrix $COV(\psi \left({Y}^{*};\widehat{\beta }\right))$. This empirical fluctuation process is governed by a functional central limit theorem (Zeileis and Hornik 2007) under the null hypothesis of parameter stability: it converges to a Brownian bridge W⁰. A test statistic can be derived by applying to the fluctuation process a scalar functional λ(·) that captures its fluctuation, giving λ(W_j(·)); the corresponding limiting distribution is the same functional (or its asymptotic counterpart) applied to the limiting process, λ(W⁰(·)).
3. Compute the split point(s) that locally optimize the objective function, either for a fixed or for an adaptively chosen number of splits.
4. Split the fitted model with respect to variable Z_j* into a segmented model with B segments and repeat the procedure.
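Step 2 can be made concrete with a deliberately simplified sketch. It assumes an intercept-only least-squares model, so the score of each observation is proportional to its deviation from the mean, and it uses the maximum absolute value of the scaled partial-sum process as the scalar functional. That functional is an assumption chosen for brevity and is not necessarily the one used by Zeileis et al. (2008) or by the authors. The names `fluctuation_stat`, `z_sorted`, and `z_mixed` and the toy data are hypothetical.

```python
import math


def fluctuation_stat(y, z):
    """Max absolute value of the scaled cumulative score process, ordered by z.

    Model: intercept-only least squares. The score of observation i is taken as
    (y_i - mean); the constant factor of the true derivative cancels in the scaling.
    """
    n = len(y)
    mean = sum(y) / n
    scores = [yi - mean for yi in y]
    j_hat = sum(s * s for s in scores) / n       # variance estimate of the scores
    order = sorted(range(n), key=lambda i: z[i])  # order observations by z
    cum, stat = 0.0, 0.0
    for i in order:
        cum += scores[i]
        stat = max(stat, abs(cum) / math.sqrt(j_hat * n))
    return stat


# Toy data (hypothetical): the mean of y shifts from 1 to 3 halfway through.
y = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
candidates = {
    "z_sorted": [1, 2, 3, 4, 5, 6, 7, 8],  # ordering aligned with the shift
    "z_mixed": [1, 5, 2, 6, 3, 7, 4, 8],   # ordering unrelated to the shift
}
stats = {name: fluctuation_stat(y, z) for name, z in candidates.items()}
selected = max(stats, key=stats.get)
print(stats, selected)
```

When the observations are ordered by a variable that is related to the shift, the cumulative scores drift far from zero before returning, so that variable shows the larger statistic and would be the one selected for splitting. A full implementation would compare p-values from the limiting distribution rather than raw statistics.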

Drawing on the literature on the determinants of turnout and considering that local parameters are estimated for each node ℓ, we estimate the following regression:

$${Turnout}_{c}^{*}={\alpha }_{c}+{\alpha }_{1\ell }{\mathbf{EDU}}_{c}+{\alpha }_{2\ell }{\mathbf{ECO}}_{c}+{\alpha }_{3\ell }{\mathbf{SOCIODEM}}_{c}+{\varepsilon }_{c}$$

(7)

where c refers to an individual county, for a total of 3046,^{Footnote 8} and ${Turnout}_{c}^{*}=\left({\varvec{I}}-\rho {\varvec{W}}\right){Turnout}_{c}$ is the spatially filtered turnout, where $\rho$ is obtained as described in Eq. (1). The spatial weight matrix ${\varvec{W}}$ was chosen by comparing nine different weighting schemes: k-nearest neighbors of order 5 to 10, a Queen contiguity matrix, an inverse distance matrix, and an inverse distance matrix with a cut-off at 200 km. We estimated our regression tree for each of them and selected the model with the lowest Akaike Information Criterion (AIC), which corresponds to a ${\varvec{W}}$ based on k-nearest neighbors of order 9. As customary, ${\varvec{W}}$ has been row-standardized.
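To illustrate what a row-standardized k-nearest-neighbor weight matrix looks like, the sketch below builds one from planar coordinates. It assumes Euclidean distance between made-up points and k = 2; real county centroids would call for a geographic distance, and the paper's selected scheme uses k = 9. The function name `knn_weights` and the coordinates are hypothetical.

```python
import math


def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbor spatial weight matrix (list of lists)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        # Distances from unit i to every other unit, nearest first.
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )
        for _, j in dists[:k]:
            w[i][j] = 1.0 / k  # each of the k neighbors gets equal weight
    return w


# Four made-up points on a line; the last one is far from the others.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
w = knn_weights(coords, k=2)
for row in w:
    print(row)
```

Note that the resulting matrix need not be symmetric: the isolated fourth point counts the third point as a neighbor, while the third point's two nearest neighbors do not include the fourth.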

The variables can be described as follows^{Footnote 9}:

1) Turnout, the dependent variable, is the share of the county voting-eligible population (VEP), which is registered and legally empowered to cast its vote in presidential elections. Using VEP instead of VAP (voting age population) to measure turnout corrects for the number of ineligible felons and non-citizen residents. Nevertheless, the use of these two different types of data is controversial as both measurement methods have their biases (Holbrook and Heidbreder 2010). We, therefore, estimate the models also using the citizen voting age population (CVAP). The results are presented in the Appendix (Table A3 and Fig. A2).
2) The vectors EDU and ECO include variables aiming to capture the role of education (Less than graduate and University education) and the economic system (Unemployment and Household income) respectively.
3) The vector SOCIODEM contains several socio-demographic variables (Adherents, Urban population, Hispanics, Blacks, and Veterans). Blacks and Hispanics indicate the percentage of a county’s population in the respective racial category.

The description, the summary statistics and the source of all the data are set out in Tables 1 and 2. Table A1 in the Appendix presents the correlation matrix of the variables considered. Figure 1 shows that the distribution of turnout across US counties is not random in space, as indicated by Moran’s I (+ 0.48), positive and significant.^{Footnote 10} Higher turnout is observed along the coast from New England to Louisiana and in the Midwest, Deep North and Rocky Mountain regions. Lower participation is in the Sun Belt, Texas, Deep South, and Appalachian regions.
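For readers unfamiliar with the statistic, Moran's I can be computed directly from its definition, I = (n / S₀) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where S₀ is the sum of all weights. The sketch below applies it to a hypothetical four-unit chain with steadily increasing values; the function name `morans_i` and the data are illustrative and unrelated to the county data.

```python
def morans_i(x, w):
    """Moran's I of values x under the spatial weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in w)  # total weight; equals n if w is row-standardized
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical chain 0 - 1 - 2 - 3 with row-standardized contiguity weights.
w = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]  # values rise smoothly along the chain
value = morans_i(x, w)
print(value)
```

Because neighboring units carry similar values, the statistic is positive on this toy example (0.4), which is the same qualitative pattern the positive Moran's I reported for turnout indicates.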

Table 1 Variables, description, and sources

Table 2 Descriptive statistics

Results

Table 3 details the results of our model for the eleven terminal nodes identified by the regression tree. The spatial lag coefficient is positive and statistically significant (therefore the “neighborhood effect” is in play).^{Footnote 11} Moran’s I on residuals is positive and significant in both the standard (Table A2 and Fig. A1) and spatial lag regression tree model (Table 3). However, the very low value (0.04) of the index in the latter model indicates that results biased by spatial autocorrelation are unlikely.

Table 3 Spatial lag regression tree estimates

The spatial lag regression tree is also preferred to the standard regression tree model in terms of the AIC.^{Footnote 12} We find that political participation is geographically clustered. Although this is in line with Tobler’s First Law of Geography (Tobler 1970), this is likely the result of several factors, such as labor and capital mobility, common culture, social involvement, as well as social networking (Lacombe et al. 2014).

The most significant non-linearities are found for median household income, the root or father node (the cut is at $10,557), followed by Black population (the cut is 2%) and subsequently by urban population (91.1%) and less than graduate education (43.7%). This suggests that income dominates the shares of the Black, the less than graduate, and the urban populations as the main identifier for multiple regimes. The economic conditions of voters, as well as race and ethnicity, are among the most important social and cultural divisions within the American electorate and their impact on voter turnout is a persistent finding in the literature (Fraga 2016; McCarty et al. 2016). Nonetheless, within the eleven subsets of counties, the results underscore the different signs and significance of the determinants of turnout (Fig. 2).

Node 3 includes 371 counties with median household income lower than $10,557 and a Black population lower than 2.4%. In this node Hispanics, household income and Veterans increase voter turnout while Blacks, the urban population and the percentage of the unemployed have a negative impact on voter turnout.

Node 5 (103 counties) includes a subset of counties with the same income characteristics as before, a Black population higher than 2.4% and a percentage of people with less than graduate education below 49.7%. In this subgroup, the higher the Black population and household income, and the fewer the graduates and the unemployed, the higher the voter turnout. In this node, a larger share of the population living in urban areas reduces voter turnout.

Node 6 (346 counties) differs from the previous node with a percentage of people with less than graduate education higher than 49.7%. In this subset, the results show that the higher the percentage of people affiliated to a group, Blacks and Veterans the higher the voter turnout while Hispanics, the urban population, less than graduate or with university education are negatively associated with voter turnout.

Node 9 (144 counties) encompasses a group of cases of median household income higher than $10,557, Black population lower than 2%, and a percentage of less than graduates below 43.7%. In this terminal group, as the Hispanics, household income, and the number of Veterans increase, voter turnout improves.

Node 11 (138 counties) includes a subset of counties with the same income characteristics as before, a Black population lower than 2%, a percentage of people less than graduate above 43.7% and a percentage of people affiliated to a group below 30.7%. In this node, median household income and the Veterans increase turnout while Blacks, urban population, and less than graduate and university graduate do not.

Node 13 (255 counties) includes counties with median household income higher than $10,557, Black population lower than 2%, a percentage of people with less than graduate education above 43.7%, a percentage of people affiliated to a group higher than 30.7% and Veterans lower than 10.9%. In this node the higher the Blacks and the median household income, the higher the voter turnout. The results also indicate that the urban population, less than graduate, university graduate and the unemployed negatively impact on turnout.

Node 14 (664 counties) has the same characteristics as node 13 but the percentage of Veterans is higher than 10.9%. In this subset people affiliated to a group as well as Veterans increase voter turnout. Household income also has a positive impact, while Blacks, people living in urban areas, people not graduated and people unemployed decrease turnout.

Node 18 (394 counties) takes in counties with median household income higher than $10,557, Black population above 2%, the percentage of urban population lower than 91.1%, people with less than graduate education lower than 3.7% and Black population lower than 15.3%. It is worth noting that in this (and the following) nodes Black population enters again as a cut, showing a second non-linearity in this variable. In this terminal group, Adherents, Veterans, the urban population and finally the median household income increase voter turnout. The results also show that Hispanics negatively affect voter turnout.

Node 19 (100 counties) has the same characteristics as node 18 with the Black population higher than 15.3%. In this subset of counties, membership of a group and the median household income increase voter turnout while Hispanics and Veterans decrease participation.

Node 20 (370 counties), in addition to the first three cuts as before, has people with less than graduate education above 3.7%. In this subset, the Hispanics and population with less than graduate education and the population living in urban areas reduce turnout. Members of a group and Blacks are more likely to turn out to vote. Household income and voting age also positively impact on turnout.

Finally, node 21 (161 counties) consists of counties that, after the first two cuts seen above, have a percentage of urban population above 91.1%. In this terminal group, members of a group and the population with less than graduate education reduce voter turnout.

Overall, our findings show that throughout the subgroups the effects of the explanatory variables on voter turnout are not uniform and vary over contexts. Indeed, the magnitude and the sign of the coefficients of the variables included in the analysis partly differ from subgroup to subgroup.

In Fig. 3, shades of the same color represent differences within similar counties, whereas different colors denote differences between groups of counties. The first way to interpret the figure (i.e., differences within similar colors) emphasizes the variables involved in the estimations and, in particular, the most important non-linearities shared by the counties (median household income and Black population). From this perspective, the grayscale counties are those with household income below $10,557, while the green and blue counties are those with household income above $10,557. Within the latter, the split generating the two groups occurs at a Black population share of 2%: counties below it are shown in the green scale and counties above it in the blue scale.

The second way (i.e., dissimilarities among colors) deals with the outcome of the splitting process. Specifically, it looks at the geographical patterns of terminal nodes across the US. From this point of view, the map shows a clustering of different subgroups in several areas of the country, which in turn is a sign of polarization. The group of counties with household income below $10,557 is mainly located in the south-eastern part of the country, except for the coast, where there are richer counties with a Black population higher than 2%. The area with vertices Kansas to Michigan and from Northern California to the State of Washington includes counties with household income above $10,557 and less than 2% of Black people. A few interesting subsets of counties emerge. The counties located on the Southern coasts and in the South-East of the country belong to node 3, while there are large swathes of green (node 14) in most of the central states. Texas and Florida are divided into two groups: node 3 and node 20 (very far apart in the figure) for the Lone Star State and node 18 and node 21 for the Sunshine State. Node 3 is also recurrent in the Appalachians, South Texas and mid-West, and includes mostly rural counties with a small population, predominantly White.

Among the counties in node 21, there are metropolitan areas such as Dallas (Dallas), New York (Hudson, New York, Richmond), Charlotte (Mecklenburg), Tucson (Pima), Phoenix (Maricopa), Pittsburgh (Allegheny), Chicago (Cook and Will), San Francisco (Santa Clara), San Diego (San Diego), Los Angeles (Los Angeles), and Atlanta (Fulton).

Conclusions

Over the past few decades, and more dramatically in recent years, American society has become more polarized in demographic, socio-economic, cultural, and political terms. An increasingly divided landscape hints at the heterogeneity of the electorate as a crucial factor for understanding trends in voting behavior. This paper tackles this issue. We put forward a combined methodology to deal with heterogeneity in regression coefficients and spatial dependence in the analysis of turnout in the 2012 US presidential elections, in order to group the behavior of voters into several geographically located clusters. We find that turnout in a given county is associated with the turnout in the surrounding counties. As pointed out by LeSage and Dominguez (2012), one of the possible sources of spillover effects might be commuting, or social involvement and social networking (Tam Cho and Rudolph 2008). Furthermore, for different groups of US counties obtained through a spatial lag regression tree procedure, some variables have different statistical significance (or lack of it), and sometimes different signs, revealing multiple regimes of voting behavior mainly driven by household income, i.e., the root node. This non-homogeneity in regression coefficients, which in the spatial lag regression tree are unbiased by spatial dependence, is obscured by traditional methods that extrapolate a single average relationship between the variables.

Analyzing voting mobilization and identifying spatially clustered homogeneous socioeconomic and demographic subgroups of counties has some implications for political campaigns. It helps campaigns target specific groups of voters, defined by their socio-demographic characteristics, in order to mobilize them, especially in swing states or districts, which are crucial to winning elections.

The methodology we use allows only cross-section data to be analyzed. Further research may compare different election years to trace turnout behavior over time and improve campaign goals as well as policy design and implementation proposals accordingly. In addition, party behavior and different types of elections may be studied, enriching the dataset with political characteristics (such as incumbency) that have not been considered here.

Change history

16 February 2023
A Correction to this paper has been published: https://doi.org/10.1007/s41651-023-00133-5

Notes

For example, see https://fivethirtyeight.com/features/how-georgia-turned-blue/ and https://centerforpolitics.org/crystalball/articles/georgia-senate-runoffs-breaking-down-november-looking-to-january/.
In April 2021, the Republican-controlled legislature passed a new law increasing the hurdles for people – in particular Blacks – to go to the polls (https://www.bbc.com/news/world-us-canada-56650565). Florida (https://www.axios.com/florida-desantis-voting-restrictions-fox-17b242df-950c-492c-a9bc-6531e82e3bf2.html) and Texas (https://www.axios.com/texas-voting-restrictions-legislation-fbeff1d4-e7b2-461d-8719-3867c0dbb982.html) followed suit. At the same time, the Senate majority and the White House embarked on an overhaul of US elections through the For the People Act and the John Lewis Voting Rights Advancement Act (https://apnews.com/article/joe-biden-voting-voting-rights-legislation-elections-5fde82aee5edbdabde43f48d7fc60e09).
The appropriate definition of polarization as well as the extent of its increase are highly debated in the literature, and conflicting views have emerged in the past (see Glaeser and Ward 2006; Abramowitz and Saunders 2008; Fiorina and Abrams 2008; Prior 2013). Recent data, however, paint a richer and more nuanced picture of political polarization than previous discussions have suggested (see Boxell et al. 2017).
To avoid confusion, we stress that we do not deal with partisan polarization.
Partisan polarization is the difference between an individual’s feelings towards their own party and their feelings towards the opposing party (Boxell 2020). This difference has been growing over the last decade, and increasingly people sharing similar partisan stances have concentrated in given areas. As a consequence, the difference between any two areas has become larger than in the past. In this sense, the political environment has become more polarized and heterogeneous at the same time.
Ahmed and Pesaran (2020), the study conceptually most closely related to ours, jointly analyze turnout and presidential election outcomes in a model with regional heterogeneity.
This method is only developed for cross-sectional data, which is clearly a limitation for our study.
We excluded counties from Hawaii, Alaska and Puerto Rico for lack of reliable data on VEP.
We do not include variables related to parties, candidates, and similar characteristics.
The index measures the correlation of turnout in a county with the average of the neighboring counties. Where it has a positive (negative) and significant sign, similar (dissimilar) counties in terms of turnout are located close to each other.
As in Wagner and Zeileis (2019), we do not compute direct, indirect, and total effects, as is customary in the literature. In fact, as highlighted in the previous section, the initial estimation of the spatial lag model aims at filtering spatial dependence in the independent variable in such a way that a standard regression tree can be estimated. In this way, non-linearities in the independent variables can be accounted for while minimizing the risk of residual spatial autocorrelation that may affect the results.
AIC is used as a measure of goodness of fit as it is computed for the overall model, while R-squared is not computed for the overall model, but for each node.
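
The note above on the spatial autocorrelation index can be made concrete with a short sketch. It assumes that the index is global Moran's I computed with a row-standardized contiguity matrix, which these notes do not state explicitly; the four-county layout and turnout values are hypothetical, and no significance test is included.

```python
import numpy as np


def row_standardize(adjacency):
    """Divide each row of a binary adjacency matrix by its row sum."""
    adjacency = np.asarray(adjacency, dtype=float)
    row_sums = adjacency.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every unit needs at least one neighbor")
    return adjacency / row_sums


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), z = centered values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    weights = np.asarray(weights, dtype=float)
    n = z.size
    s0 = weights.sum()  # sum of all weights; equals n when rows sum to one
    return (n / s0) * (z @ weights @ z) / (z @ z)


# Four hypothetical counties on a line: 0-1, 1-2 and 2-3 are neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_standardize(adjacency)

# Toy turnout patterns: neighbors alike versus neighbors unlike.
smooth = [1.0, 2.0, 3.0, 4.0]
alternating = [1.0, 4.0, 1.0, 4.0]

print("smooth pattern:     ", morans_i(smooth, W))
print("alternating pattern:", morans_i(alternating, W))
```

The smooth pattern, where neighbors have similar values, gives a positive index, while the alternating pattern gives a negative one, matching the positive (negative) reading described in the note.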

References

Abramowitz AI, Saunders KL (2008) Is polarization a myth? J Polit 70:542–555
Agnew J (1987) Place and politics. Allen and Unwin, Boston
Ahmed R, Pesaran MH (2020) Regional heterogeneity and US presidential elections, CESifo Working Paper, n. 8615
Anselin L (1988) Spatial econometrics: methods and models. Kluwer Academic Publishers, Dordrecht, The Netherlands
Anselin L, Bera AK (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. Statistics: Textbooks and Monographs 155:237–289
Boxell L (2020) Demographic change and political polarization in the United States. Econ Lett 192. https://doi.org/10.1016/j.econlet.2020.109187
Boxell L, Gentzkow M, Shapiro JM (2017) Greater Internet use is not associated with faster growth in political polarization among US demographic groups. Proc Natl Acad Sci 114(40):10612–10617
Calvo E, Escolar M (2003) The local voter: a geographically weighted approach to ecological inference. Am J Polit Sci 47(1):189–204
Cancela J, Geys B (2016) Explaining voter turnout: a meta-analysis of national and subnational elections. Elect Stud 42:264–275
Cann DM, Cole JB (2011) Strategic campaigning, closeness, and voter mobilization in U.S. presidential elections. Elect Stud 30:344–352
Durkan W (2021) Changing geographies of voter turnout: Michigan and the urban/rural divide. Polit Geogr 102449. https://doi.org/10.1016/j.polgeo.2021.102449
Fieldhouse E, Cutts D (2008) Diversity, density and turnout: the effect of neighbourhood ethno-religious composition on voter turnout in Britain. Polit Geogr 27(5):530–548
Fieldhouse E, Cutts D (2012) The companion effect: household and local context and the turnout of young people. J Polit 74(3):856–869
Fiorina MP, Abrams SJ (2008) Political polarization in the American public. Annu Rev Polit Sci 11:563–588
Fraga BL (2016) Candidates or districts? Reevaluating the role of race in voter turnout. Am J Polit Sci 60(1):97–122
Galster GC (2012) The mechanism(s) of neighbourhood effects: theory, evidence, and policy implications. In: van Ham M, Manley D, Bailey N, Simpson L, Maclennan D (eds) Neighbourhood Effects Research: New Perspectives. Springer, Amsterdam, pp 23–56
Gimpel JG, Lay JC, Schuknecht JE (2003) Cultivating democracy: civic environments and political socialization in America. Brookings Institution, Washington, DC
Glaeser EL, Ward BA (2006) Myths and realities of American political geography. J Econ Perspect 20(2):119–144
Holbrook T, Heidbreder B (2010) Does measurement matter? The case of VAP and VEP in models of voter turnout in the United States. State Polit Pol Q 10(2):157–179
Iyengar S, Lelkes Y, Levendusky M, Malhotra N, Westwood S (2019) The origins and consequences of affective polarization in the United States. Annu Rev Polit Sci 22:129–146
Kahane LH (2009) It’s the economy, and then some: modeling the presidential vote with state panel data. Public Choice 139(3):343–356
Kim J, Elliott E, Wang D-M (2003) A spatial analysis of county-level outcomes in US presidential elections, 1988–2000. Elect Stud 22:741–761
Lacombe D, Holloway G, Shaughnessy T (2014) Bayesian estimation of the spatial durbin error model with an application to voter turnout in the 2004 presidential election. Int Reg Sci Rev 37(3):298–327
LeSage JP, Dominguez M (2012) The importance of modeling spatial spillovers in public choice analysis. Public Choice 150(3):525–545
McCarty N, Poole KT, Rosenthal H (2016) Polarized America: the dance of ideology and unequal riches. MIT Press, Cambridge
Moretti E (2012) The new geography of jobs. Houghton Mifflin Harcourt
Morgan JN, Sonquist JA (1963) Problems in the analysis of survey data, and a proposal. J Am Stat Assoc 58:415–434
Pattie CJ, Johnston RJ (2000) People who talk together vote together: an exploration of contextual effects in Great Britain. Ann Assoc Am Geogr 90(1):41–66
Pew Research Center (2017) The partisan divide on political values grows even wider, Washington D.C
Prior M (2013) Media and political polarization. Annu Rev Polit Sci 16:101–112
Storper M (2018) Separate worlds? Explaining the current wave of regional economic polarization. J Econ Geogr 18(2):247–270
Tam Cho W, Rudolph T (2008) Emanating political participation: untangling the spatial structure behind participation. Brit J Polit Sci 38(2):273–289
Tobler W (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46(Supplement):234–240
Wagner M, Zeileis A (2019) Heterogeneity and spatial dependence of regional growth in the EU: a recursive partitioning approach. Ger Econ Rev 20:67–82
Zeileis A, Hornik K (2007) Generalized M-fluctuation tests for parameter instability. Stat Neerl 61(4):488–508
Zeileis A, Hothorn T, Hornik K (2008) Model-based recursive partitioning. J Comput Graph Stat 17:492–514

Funding

Open access funding provided by Università degli Studi di Brescia within the CRUI-CARE Agreement.

Author information

Authors and Affiliations

University of L’Aquila, L’Aquila, Italy
Nadia Fiorino
University of Brescia, Brescia, Italy
Nicola Pontarollo
University of Verona, Verona, Italy
Roberto Ricciuti

Corresponding author

Correspondence to Nicola Pontarollo.

Ethics declarations

Ethics Approval

On behalf of all authors, the corresponding author states that the paper satisfies Ethical Standards conditions; no human participants or animals are involved in the research.

Conflict of Interest

The authors declare no competing interests.

Informed Consent

On behalf of all authors, the corresponding author states that no human participants are involved in the research and, therefore, informed consent is not required. On behalf of all authors, the corresponding author consents to the submission of this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table A1 Correlation matrix

Table A2 Standard regression tree

Table A3 Spatial lag regression tree for CVAP

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fiorino, N., Pontarollo, N. & Ricciuti, R. Detecting Dividing Lines in Turnout: Spatial Dependence and Heterogeneity in the 2012 US Presidential Election. J geovis spat anal 6, 34 (2022). https://doi.org/10.1007/s41651-022-00127-9

Accepted: 12 October 2022
Published: 11 November 2022
DOI: https://doi.org/10.1007/s41651-022-00127-9
