Introduction

Pundits and public opinion see political polarization on the rise in Western democracies. Empirical studies of various kinds claim that polarization is increasing (Dettrey and Campbell 2013), decreasing (Bauer and Munzert 2013), or not really changing (DiMaggio et al. 1996; Fiorina and Abrams 2008).

These conflicting messages stem partly from the fact that different issues or national contexts are studied, but an underlying problem is foremost that researchers diverge in their conceptual definition and measurement of polarization (Lelkes 2016). What they do agree upon is the general notion that polarization means an accentuation of differences. This general notion connects research on political polarization to a wider literature strand, measuring polarization also in distributions of, e.g., income (Esteban and Ray 1994), employment (Goos and Manning 2007; Cirillo 2018), or ethnicity (Montalvo and Reynal-Querol 2005; Schneider and Wiesehomeier 2006).

Overviews of existing polarization measures and principles are yet scarce (but see Bramson et al. 2016; Bauer 2019) and a more thorough connection of theoretical principles to empirical reality is desirable. Utilizing large-scale survey data, this study adds an empirical comparison of existing polarization measures for single attitudes, to distinguish an empirically grounded single polarization measure which can be further decomposed into underlying components representing different aspects.

Conceptual approaches to polarization

Polarization literature is fragmented across disciplines and topical content (see Bauer 2019, p. 2). To avoid misunderstandings, researchers should clarify how their work relates to the many existing conceptual approaches. First,we briefly outline these. Then, we clarify how our approach relates to them.

Psychology derived an individual-based definition of polarization: “members of a deliberating group move toward a more extreme point in whatever direction is indicated by the members’ predeliberation tendency” (Sunstein 2002). Already Myers and Lamm (1976) point out in their review on this group polarization phenomenon that there are other more complex concepts, where the term polarization “refer[s] to a split within a group of people.”

The latter is called societal polarization and that is what current debates are about, focusing on societies instead of groups. Under this umbrella term, political science further distinguishes between elite polarization, (accentuated differences within groups of elected officials), and mass polarization (accentuated differences in attitudes within the general public) (Fiorina and Abrams 2008). Approaches from an economics or ethnic-diversity background might not focus on political topics as such, but could likewise be termed mass polarization studies since accentuated differences within the general public are studied and similar measures might be applied. Another conceptually distinct facet of societal polarization is affective polarization: the degree of in-party love and out-party hate. Unlike elite- and mass polarization, it is an individual-level construct rooted in the social identity approach (Iyengar et al. 2012; Tajfel and Turner 2004).

Mass polarization is what we study in this paper: When we use the term polarization, it refers to accentuated differences within the general public. Further on, our conceptual approach is unidimensional: We investigate polarization within a single attitude (see Bauer 2019, p. 4) with a focus on attitudes toward political and politicized issues. However, we also include other topics, because of the conceptual similarity outlined previously. Thus, we do not analyze polarization conceptualized as ideological alignment across several issues as Abramowitz and Saunders (2008) or Baldassarri and Gelman (2008).

Measurement concepts from the literature

Researchers investigating mass polarization face two basic problems: Firstly, which data source should they use and secondly how should they assess polarization from the data source. Researchers agree on the answer to the first question: They measure mass polarization from responses in representative surveys (Bauer 2019). Researchers do not agree on the answer to the second question. Bramson et al. (2016) provide nine theoretically motivated polarization principles. Ongoing work by Bauer (2019) outlines existing measurement approaches and their conceptual differences. We give a short summary here to position the approach we use.

Early on, based on a solid axiomatization Esteban and Ray (1994) integrated several polarization features (‘within-group homogeneity,’ ‘inter-group heterogeneity,’ ‘small number of significantly sized groups’) into a common framework that distinguishes polarization from dispersion and inequality.

Esteban and Ray (1994) also described another commonly mentioned polarization feature: The center loses people to the extremes. This aspect is also mentioned and operationalized by Fiorina and Abrams (2008) and Dettrey and Campbell (2013).

DiMaggio et al. (1996) identified two principles for measuring societal polarization on a single attitude distribution and specified which distributional properties might capture them: The dispersion (measure: variance) and the bimodality principle (measure: kurtosis), which they based on a societal translation of the results of Esteban and Ray (1994) without utilizing their full measure. Downey and Huffman (2001) argued that these measures, variance and especially kurtosis, are very insensible when assessing polarization in multimodal structures.

Based on the measure of Esteban and Ray (1994) and expert opinions, Koudenburg et al. (2021) introduced a variation of the polarization measure for five-point rating scales that takes into account the psychological distance of individuals within a distribution, toward each other, and a neutral midpoint.

Van der Eijk (2001) provides another route to measure polarization as the opposite of the measurement of agreement in ordered rating scales. The coefficient of agreement was developed to measure how much respondents in a survey agree on the position of, e.g., a political party.

Recently, Bramson et al. (2016) extended and explicated existing frameworks to derive nine different aspects of polarization. For assessing the shape of the distribution, they introduce dispersion (mean absolute deviation instead of variance). Two more measures are based on counting attitudes which haven’t been taken by any individual (empty bins in a histogram): coverage (number of nonempty bins) and regionalization (number of gaps of empty bins). Like spread (range of a sample, maximum minus minimum), which can also be used for two groups, the remaining five aspects solely deal with subgroups within a distribution, assessed via exogenous variables or endogenously from the distribution’s shape. Bramson et al. (2016) mention that these aspects share similarities with the polarization features of Esteban and Ray (1994). Group consensus is measured as the absence of within-group dispersion. Distinctness refers to the overlap of the groups’ attitude distributions (see also Schmid and Schmidt 2006). Group divergence concerns the difference of the groups’ averages. Community fragmentation is either assessed from exogenous variables or endogenously by counting distributional modes. Size parity captures how equally sized groups are. The nine aspects provide a useful terminology, but many measures cannot be used in a robust way to compare different empirical distributions. For example, larger group divergence may become pointless when at the same time size parity approaches zero.

Research gaps

For the present study, we conclude that there is a demand for a practical measure of polarization in single attitude distributions which is adapted toward the empirical reality of attitude distributions. The first two principles of DiMaggio et al. (1996) are still worth to distinguish and the nine aspects of Bramson et al. (2016) provide useful terminology. Nevertheless, we claim some integration is desirable. In the following, we approach the goal of integration by exploration of empirical data based on the theoretical measurement concepts provided in the literature. We pose

Research Question 1:

Which measurement concepts of societal polarization on a single attitude dimension make a difference in empirical data?

Answering this question can advance polarization research in two ways. Firstly, it can guide future researchers in their measurement approach. Secondly, it can help bridging gaps between measurement theory and empirical reality. As Bramson et al. (2016) show for the aspects spread, coverage and regionalization, measures can be theoretically sound, but hardly applicable to empirical scenarios with survey data. Ideally, a good measure should satisfy both needs. Previous research already described characteristics of attitude distributions that researchers might encounter, such as a trimodal structure with one large central and two off-central peaks (see Downey and Huffman 2001; Lorenz 2017). We thus pose

Research Question 2:

Are there stylized characteristics of attitude distributions which can be used to improve the measurement?

Lastly, the value of a measure lies in the ability to draw informative conclusions when applied. To this end, we pose

Research Question 3:

What can we say about single attitude polarization for different topics, different European countries, and their time trends?

Data

We use data from the European Social Survey (ESS) (European Social Survey ERIC (ESS ERIC) 2016). Nine biennial ESS waves cover the years 2002 to 2018 and a total of 33 European countries (including Israel and Russia). Coverage of countries and topics varies across waves, whereby twelve countries and 19 questions appear in all waves (with one missing question for the Irish sample of wave one). Figure 1 in ESM provides an overview. We focus on 33 variables measured on eleven-point rating scales. Some stem from the ESS core module, others from rotating modules. Variables, their ESS short-label acronym, and their full verbal labels are listed in Table 1 in ESM. Variables from the core module include the catch-all political position on the left–right continuum (LRSCALE), attitudes on European unification (EUFTF) and immigration (IM..., three items), satisfaction ratings (STF..., HAPPY, seven items), ratings of generalized trust (PPL..., three items), trust in institutions (TRST..., seven items), and ratings of emotional attachment to the country and Europe (ATCH..., two items). From the rotating modules, we include ratings of fairness (..FR..., four questions, 2018) and attitudes toward climate change (CC..., or ...CC, four questions, 2016). All variables have certain aspects of individual attitudes but some are not at the core of political discourse which is thought to be especially prone to polarization. As we outlined in the introduction, measures of polarization can also be applied to “nonattitudinal properties” as long as appropriate data formats such as “Likert scale responses of one’s degree of agreement in a survey statement” are used (c.f. Bramson et al. 2016, p. 81). In the following, we will refer to all the Likert-type items listed above as attitudes. To focus on political polarization, we consider the three topics LRSCALE, EUFTF, and IMUECLT (“Does immigration undermine or enrich the culture of the country?”) as the core political topics in our data set and give special attention to them in the following.

The ESS is conducted to allow inferences about the general population of each country in each wave, which allows us to infer the degree of mass polarization. Its high standard of multilingualism allows the comparison between countries (Davidov et al. 2008). Representativity for the general population is reached by probability sampling with an effective sample size of 1500 (800 for small countries), which means, after discounting for design effects via design weights. We use the design weights of respondents to compute the distribution of responses on the 0–10 scale for every variable-by-country-by-year combination. We record the fractions of responses for each answering option and ignore all non-valid answers. Appendix 1 also provides information about response rates (fractions of individuals in the random sample who responded) for all country surveys. We call the distribution of answers to a question an attitude landscape. In total, we have 4155 attitude landscapes in our data set.

Our aim is to quantify the degree of polarization for these attitude landscapes. We start out with the “Empirical comparison of measurement concepts” section, where we compare existing measures to answer Research Question 1. We continue with the “Pentamodal model of attitude landscapes” section, where we exploit distinctive properties of attitude landscapes in pursuit of Research Question 2 and introduce an improved measurement model that assumes five endogenous groups in the population. Lastly in the “Empirical findings” section, we employ this model and answer Research Question 3.

Empirical comparison of measurement concepts

In the following, we first sketch out the common ground of polarization measures for single attitude distributions and exemplify remaining conceptual problems. Second, we show how the most common polarization measures can be applied to our data and discuss how they might rank these conceptual issues. Finally, we empirically compare different measures with a correlation analysis using the 4155 attitude distributions in our dataset. This answers Research Question 1.

Common ground and conceptual issues

Fig. 1
figure 1

The common ground and the conceptual problem of measuring polarization

The common ground for measuring societal polarization is that almost all measurement concepts agree on which attitude landscapes are maximally and minimally polarized. Minimal polarization is reached when all respondents agree on one attitude value (similarly already discussed for measuring consensus by Leik (1966)). It does not matter where the consensual value lies on the attitude scale (see example “minimal” in Fig. 1). Maximal polarization is achieved when the population is equally divided on both extremes of the scale (see example “maximal” in Fig.1). In our data, this would be the attitude landscapes, where 50% have attitude 0 and the other 50% have attitude 10. That way all polarization measures can be normalized to range from 0 to 1.Footnote 1 This view on the extremes captures the dispersion as well as the bimodality principle of DiMaggio et al. (1996) and is also formulated mathematically in Theorem 2 of Esteban and Ray (1994).

The conceptual problems in measuring polarization come when intermediate polarization is to be assessed. Figure 1 shows three stylized example landscapes of intermediate polarization which are not trivial to rank. The example “equal powers” shows two bins of equal size. Thus, there are two opposing groups with high internal consensus, but the discrepancy in attitude between these groups is not maximal. The other two examples have a larger range. Therefore, they could be seen as more polarized. The “maximal diversity” example shows no structure of opposing camps but a uniform distribution. In some sense, there is no accentuation of the differences as for the other two examples. The distribution “unequal extremes” shows the accentuation by only two groups and maximal difference. However, 90% have a consensus and thus, in another sense, it is much less polarized than the other two examples. So, for each of the three distributions, there are arguments that this is either the most or the least polarized one.

All three examples are stylized and in our sample of real-world attitude landscapes, we find none which comes close to these. Therefore, we explore next how the most common single attitude polarization measures rank these three examples and how different they measure the attitude landscapes in our data.

Measures in light of these issues

For the following operationalization of polarization measures, we call the attitude values of the rating scale to be \(0, 1, \dots , n\) and the fraction of the population holding them \( p_0, p_1, \dots , p_n \). We call the vector p, the attitude landscape. By definition, it holds \(p_i \ge 0\) for all i and \(\sum _i p_i = 1\). For the rating scale of the ESS, it holds \(n = 10\).

Esteban and Ray (1994) measure polarization as

$$\begin{aligned} \text {Pol}_\alpha (p) =\frac{2^{1+\alpha }}{n} \sum _{i,j=0}^n p_i^{1+\alpha } p_j |i-j| \end{aligned}$$
(1)

where \(\alpha\) is called the polarization sensitivity. Following the theory of Esteban and Ray (1994), \(\alpha\) should be larger than zero because for \(\alpha =0\) the measure would coincide with the Gini coefficient for the measurement of inequality.Footnote 2 The term \(p_ip_j\) weights the distance between attitudes i and j by the fractions of individuals holding both attitudes. Esteban and Ray (1994) distinguish polarization from inequality by taking into account that the contribution of antagonism between people with different attitudes (reflected in the term \(|i-j|\)) is increasing with the fraction of people holding that attitude reflected in the factor \(p_i^\alpha\) in \(p_i^\alpha p_i p_j = p_i^{1+\alpha } p_j\), which is called the identification of people with attitude i.Footnote 3

In contrast to Esteban and Ray (1994), we will also call \(\text {Pol}_0\) a polarization measure. Further on, \(\alpha\) should be less than or approximate 1.6 because otherwise not all axioms of Esteban and Ray (1994) would be fulfilled. In the following, we mostly use \(\text {Pol}_0\) and \(\text {Pol}_1\) for which the specified definitions are:

$$\begin{aligned} \text {Pol}_0(p)&=\frac{2}{n} \sum _{i,j=0}^n p_i p_j |i-j| \end{aligned}$$
(2)
$$\begin{aligned} \text {Pol}_1(p)&=\frac{4}{n} \sum _{i,j=0}^n p_i^2 p_j |i-j| \end{aligned}$$
(3)

Both have a probabilistic interpretation: \(\text {Pol}_0\) is the expected attitude distance of a pair of individuals randomly sampled from the population. So, we could also call it the average pair discrepancy. For \(\text {Pol}_1\), the probabilistic reasoning is about sampling triplets from the population and only regard those as contributing to polarization when two have the same attitude. Ignoring triplets with three different attitudes reflects that a pair’s discrepancy is considered only important when one of them is supported by a third person. When all three sampled individuals have the same attitude, there would be no discrepancy anyway.

Other common measures are the normalized mean absolute deviation from the mean (MAD)Footnote 4 and the normalized standard deviation (SD)

$$\begin{aligned} \text {MAD}(p)&= \frac{2}{n} \sum _{i=0}^np_i|i-\bar{x}|, \ \end{aligned}$$
(4)
$$\begin{aligned} \text {SD}(p)&= \frac{2}{n} \sqrt{\sum _{i=0}^np_i(i-\bar{x})^2} \end{aligned}$$
(5)

where \(\bar{x} = \sum _i p_i i\) is the average attitude.

Furthermore, the agreement index (Van der Eijk 2001) can be reversed and normalized to measure polarization (Ruedin 2016) which we call disagreement (\(\text {Dis}\)) in the following.

Having operationalized the measures, we now discuss how they rank the conceptual examples from Fig. 1. MAD, Dis, and Pol\(_0\) all measure the uniform distribution “maximal diversity” in Fig. 1 as most polarized among the three intermediate landscapes, followed by “equal powers,” and “unequal extremes” as least polarized. SD is similar but measures “unequal extremes” second and “equal powers” last because it weighs the large distance higher than the imbalance. The picture changes for Pol\(_\alpha\) with increasing \(\alpha\). For \(\alpha =0.4\), “equal powers” is most polarized and “maximal diversity” only second. The effect that discrepancy of larger bins is weighted higher with increasing \(\alpha\) kicks in. For \(\alpha =1\), “maximal diversity” even drops to be least polarized and “unbalanced extremes” becomes second. Finally, for \(\alpha =1.6\), the ranking is completely reversed (compared to α=0): “Unequal extremes” is most and “maximal diversity” least polarized. This demonstrates that a very large \(\alpha\) tends to measure high polarization in the presence of one large bin as long as there are some other bins.Footnote 5

An empirical comparison

We now turn to the third and final step and analyze how these measures differ empirically. Table 1 shows that Pol\(_0\), MAD, SD, and Dis are highly correlated given the empirically observed attitude landscapes. In particular, Pol\(_0\) shows the strongest correlation to all the three other measures compared to their pairwise correlation.

Thus for Research Question 1, we can conclude that these measurement concepts of polarization make no meaningful difference in empirical data.Footnote 6 Further on, Table 1 shows that the measurement concept of Esteban and Ray (1994) with varying \(\alpha\) is empirically relevant. There is only a moderate correlation between Pol\(_0\) and Pol\(_1\) and the correlation between Pol\(_0\) and Pol\(_{1.6}\) is even slightly negative. All these results also hold when focused on the three core political topics.

Table 1 Correlation of polarization measures in the dataset of 4155 attitude landscapes from the European Social Survey showing that mean absolute deviation (MAD), standard deviation (SD), and disagreement index (Dis) are all highly correlated with the polarization measure Pol\(_0\) or Pol\(_{0.4}\), whereas Pol\(_1\) and Pol\(_{1.6}\) measure a different aspect of polarization

From the theoretical and empirical exploration, we answer Research Question 1: Pol\(_0\) captures almost all information from MAD, SD, and Dis, while increasing \(\alpha\) in Pol\(_\alpha\) blends to another aspect which is empirically different. This other aspect kicks in only for \(\alpha\) larger than 0.4 and turns to become negatively correlated with Pol\(_\alpha\) reaching its theoretical upper bound close to \(\alpha =1.6\). Negatively correlated polarization aspects are not desirable for practical purposes; therefore, we focus on Pol\(_0\) and Pol\(_1\) in the following.

Figure 2 shows scatter plots of Pol\(_0\) and Pol\(_1\) against the average attitude for all attitude landscapes in our data set.

Fig. 2
figure 2

The polarization measures Pol\(_0\) and Pol\(_1\) of 4155 attitude landscapes from the European Social Survey against their average attitude and six examples of attitude landscapes from core political topics in 2018. The dark bins in the attitude landscapes are peaks in the landscape (neighboring bins are smaller). Countries in the examples are Serbia, Norway, Netherlands, Italy, Switzerland, and Germany. In the scatter plots, dark grey data points stem from our core topics

Pentamodal model of attitude landscapes

In this section, we pursue Research Question 2: Based on empirical characteristics of attitude distributions, we develop a theoretical model of their composition and utilize it to decompose Pol\(_0\) into meaningful components.

Even though it is obvious that the measures \(\text {Pol}_0\) and \(\text {Pol}_1\) assess different aspects of polarization, what distinguishes them from each other is not easily interpreted in an empirically relevant way. For example, Fig. 2 shows that the left–right self-placement in Norway (B) is low on \(\text {Pol}_1\) and comparably high on \(\text {Pol}_0\) although it has clear peaks with supposedly high identification more than left–right self-placement in Switzerland (E) which scores minimally higher on \(\text {Pol}_1\) but not because of multiple peaks but mostly because the landscape is dominated by a large amount of neutral attitudes. Further on, antagonism (defined as difference to attitudes of others by Esteban and Ray (1994)) caused by a central bin which is much larger than neighboring bins may be to a large extent caused by a lack of interest in the item topic and/or in filling in survey instruments among many of the neutral individuals. It would, thus, be desirable to distinguish antagonism stemming from this type of neutral individuals. The same would be of interest for antagonism caused by extreme attitudes when the corresponding bins largely exceed neighboring bins. This would enable to distinguish between polarization caused by extremists and polarization caused by moderates. In the following, we develop a theoretical model of the composition of an attitude landscape which we then use for such refinements. The model is motivated based on an empirical exploration of typical characteristics of attitude landscapes, in particular of their peaking patterns.

The example attitude landscapes in Fig. 2 show their peaking bins highlighted in a darker shade. Formally, a peak is a bin for which neighboring bins are smaller. It seems attitude landscapes are not simple distributions (e.g., uniform, bell-, or U-shaped), but have a multimodal structure. Besides these two observations, Lorenz (2017) quantified more stylized facts of attitude landscapes of left–right self-placements in the ESS: (i) The largest bin is almost always the central one usually exceeding the neighboring bins by far suggesting a discontinuous jump. (ii) Peaks appear often at the extremes (bins 0 and 10). Though often small in magnitude, still usually more people are extreme than close to extreme. (iii) The maximum number of peaks is five whereas six are theoretically possible (iv) Moderate off-center peaks (at bins 2 or 3 and 7 or 8) are frequent whereas peaks directly next to the center or the extremes (at bins 1, 4, 6, and 9) are extremely rare. Moreover, the bins around these moderate off-center peaks usually give a “smoother” impression suggesting an underlying bell shape. In our exploration of the ESS attitude landscapes, we observed these stylized facts in a similar way for all topics with the only notable exception that the central bin is not always the largest in attitude landscapes when the mean attitude is far away from neutral, though it is still almost always a peak. Figure 2 (bottom) in ESM shows more details of our exploration of peaks in ESS attitude landscapes.

We use these stylized facts of the empirical data to construct a measurement model based on the idea that attitude landscapes are composed of five latent groups related to the peaks found in the data.

The model

Attitude landscapes show properties of continuous distributions with smooth shapes, but these are contrasted by peaks at the extremes and in the center which often spike out. These properties emerge frequently across different types of response scales, as Züll and Scholz (2016) show and we find them for both 11-point and 10-point response scales (see Appendix 2). The response scale literature indicates that survey respondents often treat mid- and endpoint categories as special, which could explain the peaks at the extremesFootnote 7 and the center.Footnote 8

We translate this empirical duality into an assumption that there are two different kinds of individuals: Those who answer the question based on an underlying continuous valuation (being a real number/ sensitive to the scale), labeled as “moderates” in the following, and those who answer the question based on an underlying discrete valuation like “yes,” “undecided/neutral,” or “no” (where “yes” and “no” are replaced by the extreme labels of the underlying question, e.g., “fully agree” and “fully disagree”), labeled “extremists” and “centrists,” respectively.

Taken together, we postulate that an attitude landscape is composed of individuals from five endogenous groups: The Left Extremists (ExL), the Left Moderates (ModL), the Centrists (C), the Right Moderates (ModR), and the Right Extremists (ExR)Footnote 9. The attitude distributions within these five groups are as follows:

$$\begin{aligned} \pi ^\text {ExL}&= [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] \\ \pi ^\text {ExR}&= [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] \\ \pi ^\text {C}&= [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0] \\ \pi ^\text {ModL}&=[\pi ^\text {ModL}_0, \pi ^\text {ModL}_1, \dots , \pi ^\text {ModL}_9, \pi ^\text {ModL}_{10}] \\ \pi ^\text {ModR}&=[\pi ^\text {ModR}_0, \pi ^\text {ModR}_1, \dots , \pi ^\text {ModR}_9, \pi ^\text {ModR}_{10}] \end{aligned}$$

We assume the attitude distributions within the moderate groups are discretized and confined normal distributions (see Lorenz 2009).Footnote 10

That means, we define

$$\begin{aligned} \pi ^\text {ModL}_0&= \int _{-\infty }^{0.5} \varphi (x; \mu _\text {L}, \sigma _\text {L}) dx = \varPhi (0.5; \mu _\text {L}, \sigma _\text {L}), \\ \pi ^\text {ModL}_{10}&= \int _{9.5}^{\infty } \varphi (x; \mu _\text {L}, \sigma _\text {L}) dx = 1 - \varPhi (9.5; \mu _\text {L}, \sigma _\text {L}), \end{aligned}$$

and

$$\begin{aligned} \pi ^\text {ModL}_i&= \int _{i-0.5}^{i+0.5} \varphi (x; \mu _\text {L}, \sigma _\text {L}) dx = \varPhi (i+0.5; \mu _\text {L}, \sigma _\text {L}) - \varPhi (i-0.5; \mu _\text {L}, \sigma _\text {L}) \end{aligned}$$

for \(i = 1, \dots 9\), where \(\varphi (\cdot ;\mu ,\sigma )\) is the probability density function of a normal distribution with mean \(\mu=\mu_L \text{ or } \mu=\mu_R\) and standard deviation \(\sigma=\sigma_L \text{ or }\sigma = \sigma_R \), accordingly, and \(\varPhi\) is the corresponding cumulative distribution function \(\sigma _\text {R}\).

The pentamodal model exposes that an attitude landscape p can be modeled as a pentamodal distribution \(\pi\) which is the weighted sum of the attitude distributions of the five groups

$$\begin{aligned} \pi = w_\text {ExL}\pi ^\text {ExL} + w_\text {ExR}\pi ^\text {ExR} + w_\text {C}\pi ^\text {C} + w_\text {ModL}\pi ^\text {ModL} + w_\text {ModR}\pi ^\text {ModR}\end{aligned}$$
(6)

where the weights sum up to one \(w_\text {ExL} + w_\text {ExR} + w_\text {C} + w_\text {ModL} + w_\text {ModR} = 1\).

A pentamodal distribution is completely defined by the nine parameters—the five population frequencies and the location and scale parameters of the two moderate groups. The effective number of parameters is eight because the five weight parameters must sum up to one.

Parameter estimation

Given a real-world attitude landscape p (e.g., from ESS data), we estimate the best-fitting pentamodal distribution \(\pi = \pi (w_\text {ModL}, \mu _\text {L}, \sigma _\text {L}, w_\text {ModR}, \mu _\text {R}, \sigma _\text {R},\) \(w_\text {ExL}, w_\text {ExR},w_\text {C})\) by fitting the nine parameters with a customized standard optimization algorithm. To that end, we solve the minimization problem

$$\begin{aligned} \min _{\theta } \left( \sum _{i=0}^{10} (p_i - \pi _i(\theta ))^2 + \beta (\hat{w}_\text {ExL}^2 + \hat{w}_\text {ExR}^2 + \hat{w}_\text {C}^2)\right) \end{aligned}$$
(7)

where \(\theta = [w_\text {ModL}, \mu _\text {L}, \sigma _\text {L}, w_\text {ModR}, \mu _\text {R}, \sigma _\text {R}, w_\text {ExL}, w_\text {ExR},w_\text {C}]\) is the vector of parameters, \(\beta\) is a fitting parameter weighting the following three penalty terms which penalize negative weights \(\hat{w}_\text {ExL} = \min \{w_\text {ExL}, 0\}\), analog for \(\hat{w}_\text {ExR}\) and \(\hat{w}_\text {C}\), subject to the constraints

$$\begin{aligned} 0< \mu _\text {L}, \mu _\text {R}&< 10 \end{aligned}$$
(8)
$$\begin{aligned} w_\text {ModL}, w_\text {ModR}&\ge 0 \end{aligned}$$
(9)
$$\begin{aligned} w_\text {ModL} + w_\text {ModR} + w_\text {ExL} + w_\text {ExR} + w_\text {C}&= 1 \end{aligned}$$
(10)

using the ‘minimize’ function with the Sequential Least Squares Programming (SLSQP) algorithm in Python’s SciPy package.

The weights of extremists and centrists are allowed to be below zero. This is necessary, because some attitude landscapes have zero individuals answering 0 or 10 but any pentamodal model has a tiny positive fraction of extreme moderates due to the properties of the normal distribution. Small negative weights \(w_\text {ExL},w_\text {ExR}\) can compensate for this and forbidding negative weights in total would deliver very bad fits for some attitude landscapes. Nevertheless, negative weights are not desirable. Therefore, we introduced the penalty terms \(\hat{w}_\text {ExL}^2, \hat{w}_\text {ExR}^2, \hat{w}_\text {C}^2\).

After fitting the parameters \(\theta\), we computed \(R^2\) as a goodness-of-fit measure as

$$\begin{aligned} R^2=1-\frac{\sum _{i=0}^{10}( p_i - \pi _i(\theta ))^2}{\sum _{i=0}^{10}( p_i-\frac{1}{11})^2} \end{aligned}$$
(11)

The term \(\frac{1}{11}\) in the denominator results from the null hypothesis that the landscape has a uniform distribution and \(R^2\) specifies what proportion of the deviation from the uniform distribution is explained by the pentamodal model. The \(R^2\) values are summarized per topic and per country in Fig. 3.

Fig. 3
figure 3

\(R^2\) for each country and each topic. Ordered from lowest to highest median \(R^2\) value. Core political topics marked as red (* Outlier CY EUFTF is not plotted (\(R^2=0.76\)))

The explained variance is very high in almost all cases. However, this is not too surprising with an effective number of eight parameters to fit eleven values. In Appendix 5, we show that median \(R^2\) for different values of the weight \(\beta\) in the penalty term (Fig. 6 in ESM). These explorations lead us to use \(\beta =20\) for the results presented in the following. Furthermore, Appendix 5 shows an analysis of the residuum (Fig. 5 in ESM) between fitted and empirical distribution. These show only small variation around 0 with the worst fit being answer 8 over the ESS dataset. Further on, Appendix 5 shows the distribution of estimated parameters for all attitude landscapes and the four attitude landscapes with the worst \(R^2\) including a discussion why the model performs badly in these cases.

Figure 4 shows an example of the pentamodal model \(\pi\) as blue crosses together with the bins of the underlying attitude landscape p and a visual representation of the two Gaussian distributions of the moderates from the fitted pentamodal model. The upper part of the extreme and central bins matches the parameters \(w_\text {ExL}, w_\text {ExR}\), and \(w_\text {C}\) closely but is a result of a decomposition of the attitude landscape based on the pentamodal model which we explain next.

The source code for calculating the model parameters can be found in the Appendix 3 and the full data set can be found at DOI:10.17605/OSF.IO/DHW45.

Decomposition of empirical attitude landscapes

Despite good model fit, a pentamodal distribution never completely coincides with the underlying empirical attitude landscape. Nevertheless, we can use a best-fit pentamodal model \(\pi\) to decompose the underlying attitude landscape p into three groups: moderates \(p^\text {Mod}\), extremists \(p^\text {Ex}\), and residual centrists \(p^\text {resC}\).

$$\begin{aligned} p = p^\text {Mod} + p^\text {Ex} + p^\text {resC} \end{aligned}$$
(12)

To that end, we define

$$\begin{aligned} p^\text {Mod}_i&= \min \{p_i\ ,\ w_\text {ModL}\pi ^\text {ModL}_i + w_\text {ModR}\pi ^\text {ModR}_i\} \\ p^\text {Ex}&= [p_0 - p_0^\text {Mod}, 0, \dots , 0, p_{10} - p_{10}^\text {Mod}]\\ p^\text {resC}&= p - p^\text {Mod} - p^\text {Ex}\\ \end{aligned}$$

Of course, we could also define a decomposition into five groups, but the chosen groups suffice for our purposes.

The landscape of the moderates \(p^\text {Mod}\) is the pentamodal distribution of the moderates only capped by the empirical landscape if necessary. Note that, this distribution also has positive population at the extremes and in the center. There might even be a peak at the extreme when a larger part of the moderate normal distribution exceeds 0 or 10, e.g., when the fitted mean is close to extreme or the fitted standard deviation is very large. The landscape of the extremes \(p^\text {Ex}\) has positive values only at 0 and 10. It captures all population on these bins which are not covered in the moderate population. When the pentamodal fit is very close to the empirical landscape, then \(p^\text {Ex} \approx [w_\text {ExL}, 0, \dots , 0, w_\text {ExR}]\). Finally, \(p^\text {resC}\) covers the residual population which is neither moderate nor extreme. A good pentamodal fit makes \(p^\text {resC} \approx [0, \dots , 0, w_\text {C}, 0, \dots , 0]\).

Figure 4 shows the decomposition of attitude landscapes through stacked bins with the moderates at the bottom. The figure highlights the descriptive value of the pentamodal model: It estimates the fraction in the central bin which potentially chose 5 based on a moderate continuous attitude and those who chose 5 as a discrete choice of neutrality, e.g., because of a lack of knowledge or interest. Similarly, the model estimates the fraction of the extremists which can be counted as extreme moderates and those who are “genuine” extremists because of a discrete choice.

Fig. 4
figure 4

Distribution and model results of France left–right self-placements (LRSCALE) 2018. The underlying continuous distribution of moderates are displayed in red and green, and the best-fit pentamodal model is shown with blue crosses. The dark gray bars represent the extremists and centrists which are not from the moderate groups

Decomposition of Pol\(_0\)

We can use the decomposition of an attitude landscape into the moderates, extremists, and residual centrists based on the pentamodal model to also decompose the polarization measure \(\text {Pol}_0\) into meaningful components. Each component quantifies to which extend members of an endogenous group are polarized in relation to the overall attitude distribution. To that end, we must first define the partial polarization measure for a partial attitude landscape \(0 \le q \le p\) (the inequality is meant entrywise) as

$$\begin{aligned} \text {Pol}_\alpha (q, p) = \frac{2^{1+\alpha }}{n}\sum _{i,j=0}^{n} q_i^{1+\alpha }p_j|i-j|. \end{aligned}$$
(13)

The original measure appears as the special case \(\text {Pol}_\alpha (p) = \text {Pol}_\alpha (p, p)\).

Reconsidering the probabilistic interpretation of \(\text {Pol}_0(p)\) as average pair discrepancy, the partial polarization \(\text {Pol}_0(q,p)\) of the group with partial attitude landscape q is the average pair discrepancy individuals of the group represented by q perceive, when the other person is selected from the whole population. Each probability is weighted by the total population of the group represented by q, resulting in the component reflecting the proportion of its endogenous group within the distribution. Extremist polarization \(\text {Pol}_{0}^\text {Ex}\) would also stem from the fact that the share of left and right extremists is equally divided. A distribution driven by centrist polarization \(\text {Pol}_{0}^\text {resC}\) would mostly represent a high fraction of modeled centrists.

For further analysis of within-group homogeneity of moderates, we can additionally decompose Pol\(_0\) of moderates. It can be shown that \(\text {Pol}_\alpha (q, p) \le \text {Pol}_\alpha (p)\) and \(\text {Pol}_0(q, p) \ge \text {Pol}_1(q, p)\) for any attitude landscape p and any of its partial attitude landscapes q.Footnote 11

As mentioned, the probabilistic interpretation of Pol\(_1\) is a sampling triplet, whereby, in this case, two individuals belong to the investigated endogenous group of moderates represented by q. A high fraction of Pol\(_1\) of moderates points toward pronounced peaks, either one or two. Two peaks tend to come when also \(\text {Pol}_0\) is high.

These properties and the additive nature of Pol\(_0\) are the basis for the decomposition

$$\begin{aligned} \text {Pol}_0(p) = \underbrace{\text {Pol}_0(p^\text {Ex},p)}_\text {(i)} + \underbrace{\text {Pol}_0(p^\text {resC},p)}_\text {(ii)} + \underbrace{\text {Pol}_1(p^\text {Mod},p)}_\text {(iii)} + \underbrace{\text {resPol}_0^\text {Mod}(p)}_\text {(iv)} \end{aligned}$$
(14)

where \(\text {resPol}_0^\text {Mod}(p) = \text {Pol}_0(p) -\text {Pol}_0(p^\text {Ex},p) - \text {Pol}_0(p^\text {resC},p) - \text {Pol}_1(p^\text {Mod},p)\) is the residual part of \(\text {Pol}_0\) without the parts of the extremists and the residual centrists and without \(\text {Pol}_1\) of the moderates.

In the following, we call the decomposed components of polarization in an attitude landscape

  1. (i)

    extremist polarization (\(\mathbf{Pol} _\mathbf{0} ^\mathbf{Ex}\)),

  2. (ii)

    centrist polarization, (\(\mathbf{Pol} _\mathbf{0} ^\mathbf{resC}\))

  3. (iii)

    moderate identification-weighted polarization (\(\mathbf{Pol} _\mathbf{1} ^\mathbf{Mod}\)), and

  4. (iv)

    moderate residual polarization (\(\mathbf{resPol} _\mathbf{0} ^\mathbf{Mod}\)).

With this decomposition, we answer Research Question 2: Based on stylized characteristics of attitude distributions (namely their peakedness), we formulated the pentamodal model which allowed us to improve our measurement concepts into an empirically grounded single polarization measure that can be further decomposed.

Empirical findings

We start out by illustrating how the decomposed polarization measure can provide more meaningful interpretations of empirical attitude landscapes. Then, we show how the components interrelate. We finish the section with a practical application answering Research Question 3 on polarization across European countries, topics, and time.

In Fig. 5, we pick up the example attitude landscapes which we had analyzed in Fig. 2 regarding \(\text {Pol}_0\) and \(\text {Pol}_1\), and decompose them into the extremist \(p^\text {Ex}\), the centrists \(p^\text {resC}\), and the moderates \(p^\text {Mod}\). The landscape of moderates is further decomposed into a part which is identification-weighted \(2(p^\text {Mod})^2\) and the major remaining part \(p^\text {Mod} - 2(p^\text {Mod})^2\) (exponentiation is meant entry wise). The vector of identification-weighted moderates appears as part of the definition of \(\text {Pol}_1^\text {Mod}\). The gray scatter plots in Fig. 5 show the decomposed components of the examples in relation to all other attitude landscapes in our dataset.

Fig. 5
figure 5

The four components of the decomposition of Pol\(_0\) from Eq. (14) based on the pentamodal model. The examples are the same as in Fig. 2. The color code for examples A–F is red = extremists \(p^\text {Ex}\), green = residual centrists \(p^\text {resC}\), purple = moderates identification-weighted \(2(p^\text {Mod})^2\), and cyan = remaining part of moderates \(p^\text {Mod} - 2(p^\text {Mod})^2\)

We had previously identified Example A as most polarized with respect to Pol\(_0\) and Pol\(_1\) (see Fig. 2). Our decomposition adds two new insights: First, polarization is driven more by extremists and less by centrists (compared to other attitude landscapes). Second, the polarization of moderates is low in particular when identification-weighted.

Examples B and C show that this second insight is by no means typical for all attitude landscapes: both examples show high identification-weighted polarization of moderates. In Example B, the moderates show a bimodal distribution, whereas in C the right moderates dominate. This difference is reflected in B having higher values of the residual polarization of moderates \(\text {resPol}^\text {Mod}_0\).

Example D shows this characteristic even more clearly, where residual polarization of moderates is even higher than in B. This reflects that polarization is driven by moderates, in particular by dispersed distributions of the moderate left and moderate right. Example E shows how the decomposition can detect sources of low polarization: A large part of the polarization is because of centrists. Finally, Example F illustrates an example which is in between of distributions B with a bimodal and C with a unimodal distribution of moderates.

We shall now investigate the inter-relation between \(\text {Pol}_0\), \(\text {Pol}_1\) and the decomposed components. We illustrate this in Fig. 6. The correlation coefficients on the left-hand side are based on all 4155 attitude landscapes in our sample and on the right-hand side it is reduced to our core topics.

Fig. 6
figure 6

Pearson correlation coefficients of \(\text {Pol}_0\), its four components, and \(\text {Pol}_0^\text {Mod}\) and \(\text {Pol}_1\) for comparison. Full dataset on the left and reduction to the core political topics (LRSCALE, EUFTF, and IMUECLT) on the right

The correlation analysis shows that the moderate residual polarization has a correlation of 0.89 with the overall polarization, whereas the correlation of the moderate residual polarization to the other three components is closer to 0. This reflects our idea to isolate different independent aspects of polarization with the extremist, centrist, and identification-weighted moderate polarization. The only strongly negative correlations between the four components are between identification-weighted moderate polarization and extremist and centrist polarization. This underpins that a comparison of identification-weighted moderate polarization is most informative in comparison with the moderates residual polarization.

Further on, the identification-weighted moderate polarization is not correlated with the overall identification-weighted polarization \(\text {Pol}_1\) which shows that the decomposition and focus to the moderates delivers different information. We think, the overall identification-weighted polarization is difficult to interpret in isolation because it lumps together the identification of extremists, centrists, and moderates. High scores could be caused by any of those groups. Thus, they say little about the character of polarization.

In the following, we show how the pentamodal model and the decomposition of polarization can be used for cross-topic and cross-country comparisons and for the analysis of time trends.

Analysis of cross-topic data

We evaluate how polarized different topics are by grouping attitude landscapes by topic. That means for each topic, we have the attitude landscapes of all available country year combinations. Figure 7 shows box plots of Pol\(_0\) ordered by the median from most to least polarized. We highlight the three core topics left–right self-placement, European unification, and cultural impact of migration in red. Topic explanations are shown in Table 1 in ESM. The decomposed values of Pol\(_0\) are shown in Table 2.

Fig. 7
figure 7

Box plot of the complete dataset sorted by the median of Pol\(_0\). The mean is indicated by a circle

Table 2 Ranking from the highest to the least polarized topics in the ESS (ordered by the median)

The question about fair job chances for everyone in the country shows the largest polarization with a median of Pol\(_0\approx 0.6\). The second most polarized topic is how people assess whether limiting their own energy use could reduce climate change. Small whiskers on both topics indicate that polarization is consistently high across countries and years. The third most polarized topic is European unification but with higher variation across countries and time. Individual happiness shows the lowest polarization with \(\text {Pol}_0 \approx 0.4\).

Among the three core topics, the left–right self-placement has the lowest median polarization. All three core topics show comparably large variation in polarization. Furthermore, all are high on polarization of residual centrists \(\text {Pol}_0^\text {resC}\) and low on identification-weighted moderate polarization \(\text {Pol}_1^\text {Mod}\). This reflects that many political topics show large peaks of undecided or neutral centrists. The low level of \(\text {Pol}_1^\text {Mod}\) is not surprising because the measure is negatively correlated with \(\text {Pol}_0^\text {resC}\). European unification differs from immigration mostly on extremist’s polarization which is much stronger for European unification. The low polarization of left–right self-placement is mostly due to much lower residual moderate polarization compared to the two other core topics.

Strong differences in polarization are also visible for the two topics about emotional attachment. While the emotional attachment to Europe (ATCHERP) is the sixth most polarized topic, the attachment to the own country (ATCHCTR) has the second lowest polarization. However, both are high on polarization of extremists, emotional attachment to the own country even slightly higher than emotional attachment to Europe.

Cross-country comparison

Figure 8 shows the country rankings for European unification in 2018. The table is sorted by \(\text {Pol}_0\) but also shows all four components of its decomposition. In all columns, we color the cells with the largest values in deep red and those with the lowest in deep blue to allow an easy assessment of rankings in the four component measures and an assessment, which of the four components is over- or underrepresented in a country compared to the other countries.

Fig. 8
figure 8

Polarization ranking of countries on the topic of European unification (EUFTF) in 2018. Table and histograms are sorted by \(\text {Pol}_0\)

As already discussed in the “Empirical findings” section with respect to Fig. 5, the exceptionally high polarization on European unification in Serbia results mostly from extremist polarization and its basically trimodal shape.

We skip the analysis of Austria because the fit of the pentamodal model is bad resulting from \(p_1\) being unusually high. Further explanation is given in Appendix 5.

Figure 8 includes as examples six attitude landscapes each including a representation of the underlying pentamodal model as in Fig. 4 which may help further interpreting the results.

The lowest polarization on European unification appears in Norway with a low number of extremists, means of the moderates close to five, and a particular broad consensus on the centrist’s position. For the countries in between both extremes, the distributions have various structures. Hungary and Italy have comparable structures and similar moderates identification-weighted polarization \(\text {Pol}_1^\text {Mod}\). Although the fraction of moderates is similar, 82.5 % in Italy vs. 78.6 % in Hungary, the higher residual moderates polarization and overall polarization in Italy stems mostly from the effect of less accentuated peaks and higher uniformity within the moderates. At the lower ranks of polarization, Belgium stands for a country with more approval for European unification and double the weights of moderates right versus left. With over 78 % of moderates, identification-weighted moderate polarization is relatively high due to the overall weights. For comparison, Finland has about 80 % moderates with more similarly sized weights for the opposing moderates but with both moderates means closer to the midpoint, resulting in an overall lower polarization. Interestingly, the Brexit referendum in 2016 has not brought the United Kingdom toward strong polarization on European unification compared to other countries. However, polarization in the United Kingdom is more driven by polarization perceived by centrists than in most other countries.

The country ranking for left–right self-placement and migration in 2018 are shown in the same way in Figs. 8 and 9 in ESM.

“As Time Goes By:” trend analyses

Increasing polarization is a big lament in public media and science. However, such claimed trends do not show off as often in survey data. The pentamodal model and the decomposition of \(\text {Pol}_0\) can refine the picture. We demonstrate this with three examples for the core political topics in three different countries.

Figure 9 shows five distributions and the corresponding pentamodal model of the left–right self-placement in Denmark from 2002 to 2014, in the panels on the left-hand side. The right-hand side shows a stacked area plot of all four measures of which \(\text {Pol}_0\) is composed. Below are all components independently including a linear trend approximation with a \(\pm 1\sigma\) confidence area.

Fig. 9
figure 9

Analysis of the left–right self-placement in Denmark from 2002 until 2014. Left: Distributions and results of the pentamodal model including the proportion of moderate left and right and their according means. Right: \(\text {Pol}_0\) and the decomposed components of polarization over time

The overall left–right polarization \(\text {Pol}_0\) in Denmark has been increasing from 2002 to 2014. The increase can be attributed to a simultaneous increase of both the identification-weighted (Pol\(_1^\text {Mod}\)) and moderate polarization (resPol\(_0^\text {Mod}\)), while extremists polarization fluctuates without a clear trend, and centrists polarization declined.

The moderate left was initially large, close to centrist, and widely dispersed while the moderate right was smaller, less centrist, and more condensed. Until 2010, this picture changed with the right moderates becoming more, moving a bit closer to the center, and more dispersed, while the left becoming less, more extreme, and more condensed. Overall, the group of moderates on both sides became more and the residual centrists became less. In 2014, it looks a bit like a reversion to the 2002 situation with larger and more dispersed left moderates. So, the main driving force behind increasing polarization is the shrinking of the groups of residual centrists in favor of moderates on both sides. This is also the main reason for the overall decline of centrists polarization. We end the analysis in 2014 because there are no data for Denmark in the years 2016 and 2018.

Fig. 10
figure 10

Analysis European unification (0 = gone too far, 10 = should go further) in the United Kingdom from 2004 until 2018. Left: Distributions and results of the pentamodal model including the proportion of moderate left and right and their according mean. Right: \(\text {Pol}_0\) and the decomposed components of polarization

The polarization about European unification in the United Kingdom in Fig. 10 shows that the pentamodal model can reveal trends in different aspects of polarization which remain covered when trends in total polarization are mostly constant. From 2004 to 2014, the ratio between the moderate left and moderate right increased in favor for the opinion that European unification has gone too far. This shows a clear trend toward the Brexit referendum’s decision in 2016. The ratio became much more balanced after the Brexit decision and is becoming equally sized in the year 2018 again. The distribution of all moderates combined (the light gray distribution without the hatched parts) became more uniform until 2018 and the total level of moderates was decreasing from 81 to 70% in favor of centrists as well as extremists. This explains the increase of centrists and extremist polarization and the decrease of identification-weighted moderate polarization.

Fig. 11
figure 11

Analysis of Hungarian attitudes about the impact of immigration on the country’s culture (0 = underminded, 10 = enriched) from 2002 to 2018. Left: Distributions and results of the pentamodal model including the proportion of moderate left and right and their according mean. Right: \(\text {Pol}_0\) and the decomposed components of polarization

The example of Hungary in Fig. 11 shows an opinion shift in the attitudes about immigration’s influence on the country’s culture. With the European migrant crisis in 2015, the Hungarian parliament decided to enforce an anti-immigration policy (Thorleifsson 2017), particularly “protecting their Christian roots and culture” (Viktor Orban). This shift is also represented in the society with a spike in left extremists and moderate left in 2016. “Left” stands here for people who think that immigration undermines the Hungarian culture. Simultaneously, the centrist polarization drops, pointing toward a politicization of the society in favor of anti-migration policies. Furthermore, the extremist polarization and residual moderate polarization increased from 2014 to 2016. Remarkably, Pol\(_0\) remains mostly stable despite the shift of the mean attitude. What happened in Hungary shows more the characteristic of what is called group polarization which is the collective shift toward more extreme attitudes. This is not captured by bipolarization as measured by Pol\(_0\) and its decomposition.

Appendix 6 shows further trends for Pol\(_0\) and its four components for the three topics in Fig. 10 in ESM). The tables show the trends through the slope of a linear approximation. All countries which participated in more than six rounds of the ESS were included. No Europe-wide trend of increasing or decreasing polarization could be detected for the three core topics, neither in total polarization nor in its components.

Concluding, we have applied the polarization measure to answer Research Question 3: Regarding cross-topic comparisons, we saw considerable variation across countries but a common pattern of high \(\text {Pol}_0^\text {resC}\) and low \(\text {Pol}_1^\text {Mod}\) in the core political topics. In terms of cross-national comparisons, a similarly high variation emerged, this time for both overall polarization and also its decomposition. The same applied to polarization time trends. The pentamodal model and the decomposed polarization measures can show details of how substantial political debates, like Brexit in the United Kingdom and migration in Hungary, effect the evolution of the attitude distribution of the general public even when the overall level of polarization remains mostly constant.

Conclusion

The measurement of societal polarization on a single attitude dimension is controversially discussed in the literature. The question what is to be seen as polarized and what not may seem trivial because there is agreement about the minimally and maximally polarized attitude landscape. However, in between, there are many aspects which could be emphasized. We propose the measure of Pol\(_0\) based on the measurement concept of Esteban and Ray (1994), because it can be decomposed into different components concerning different groups and identification-weighted components as introduced by Esteban and Ray (1994). Pol\(_0\) is almost perfectly correlated with other “catch-all” polarization measures focusing on the dispersion principle of DiMaggio et al. (1996). Furthermore, it has an appealing probabilistic interpretation: It is the expected antagonism (distance in attitude) a pair of two randomly selected individuals from the population perceive.Footnote 12

Motivated by empirical observations on 4155 attitude landscapes on eleven-point scales in the ESS dataset, we developed the pentamodal model which assumes groups of left and right moderates modeled by a normal distribution and left and right extremists, and centrists modeled as groups focused on just one answer. The model allows the decomposition of attitude landscapes into these five groups. Thereby, the model also estimates the fraction of those individuals answering extreme (0 or 10) or neutral (5) into those who do so as part of their moderate continuously adjusted judgment and those who do so as a discrete choice either by a convinced extreme position or principal neutrality (which can also be interpreted as lack of knowledge or interest).

As a validation of our model, the introduced \(R^2\) showed a mean of \(R^2\)=0.99 with only 0.6% of cases dropping below \(R^2<0.95\). The model works across all countries and topics with few exceptions. Consequently, we are able to decompose Pol\(_0\). We can measure the polarization perceived by extremists and centrists. The remaining polarization of the moderates can be further decomposed by specifying the identification-weighted part Pol\(^\text {Mod}_1\) following the framework of Esteban and Ray (1994).

The pentamodal model and the decomposition of \(\text {Pol}_0\) provide a reasonable way to assess how polarized a certain topic is in a certain European country in a particular year and which group drives it in comparison with a set of reference cases, e.g., other topics in the same country, other countries on the same topic, or other years. A structured analysis can follow these questions:

  1. 1.

    How strong is the level of total of polarization \(\text {Pol}_0\) compared to the reference set?

  2. 2.

    How much is the polarization driven by extremists \(\text {Pol}^\text {Ex}_0\)? (Compared to the other aspects and to other landscapes in the reference set.) A large value indicates that there are many excess extremists which are not within the fit of moderates.

  3. 3.

    How much is the polarization driven by the residual centrists \(\text {Pol}^\text {resC}_0\)? (Compared to the other aspects and to other landscapes in the reference set.) A large value may also indicate that many individuals in the population are not very interested in forming nuanced attitudes on this topic.

  4. 4.

    What is the level of the identification-weighted polarization of moderates \(\text {Pol}^\text {Mod}_1\)? This points toward a more peaked distribution of moderates instead of a uniform distribution.

  5. 5.

    Is there something else visible in the attitude landscapes and the fitted pentamodal model? This can be assessed visually in comparison with other attitude landscapes.

Using these guidelines for the three core political topics, we found that European unification and immigration are among the polarized ones, while left–right self-placement is among the topics with the lowest polarization mostly because of higher shares of centrists. Strong variation in topical polarization exists between different countries: For example, in Norway, left–right self-placement is polarized (0.545, ranked sixth among 19 countries) but European unification not (0.458, ranked last of 19 countries); the other way round, in Estonia, European unification is polarized (0.621, ranked fifth) and left–right self-placement not (0.384, ranked last) .

Overall, we find no indication of a general trend of increasing or decreasing polarization, neither in total nor in one of its components. The strongest general increase we found for the left–right self-placement is in Denmark, which is to a large extend driven by a decrease of the number of centrists. Given the political and media discourse, attitudes on Europe in the United Kingdom and on immigration in Hungary should show trends in polarization. We do not find this for the total polarization of the general population. However, we find that polarization on European unification in the United kingdom became more driven by extremists and centrists over the Brexit discourses and less by identification-weighted moderates. The polarization on immigration in Hungary became more driven by identification-weighted moderates and extremists. At the same time, there was a substantial shift of the average attitude toward anti-immigration attitude which is not reflected in a substantial impact on total polarization. The migration attitudes polarization discussed in the media may thus either refer to this downward shift in Hungary or reflect polarization on the European level.

With this measurement framework, we provide a data-driven and more nuanced view on mass polarization such that it can be discussed with more specific definition and quantitative evidence. The polarization indices derived from the pentamodal model can also serve future analyses of context conditions for polarization, such as the political system, income inequality, social cohesion, or other country-based indicators. It would also be interesting to explore relations to other concepts of polarization, for example, issue alignment (Baldassarri and Gelman 2008).

Further discussion

A topic of some debate is the contribution of social media to polarization (Barberá 2014, 2015; Garimella and Weber 2017; Eady et al. 2019). In many studies, researchers try to infer attitude scores (typically liberal-conservative or left–right) from postings (e.g., tweets), the follower–followee network, and context information. The population active on social media does not represent the general population well (Barberá and Rivero 2015). Our work can help to compare polarization in social media with the general population. To that end, we provide the response rates extracted from the ESS documentations in Appendix 1.

Our data exploration elicited that almost all attitude landscapes have multimodal structures and do not follow simple distributions. This raises the questions if these patterns emerge through processes of attitude dynamics in the population which are analyzed with agent-based models (Flache et al. 2017; Lorenz et al. 2021). The pentamodal model can provide a solid structure of empirical data as stylized facts to validate such models based on their macroscopic outcomes (see Meyer 2019).

The conceptual problem of measuring polarization as depicted in Fig. 1 showed three different types of intermediate polarization. The difference between Pol\(_0\) and Pol\(_1\) distinguishes well between the first (“maximal diversity”) and the other two. When three attitude landscapes have the same Pol\(_0\), the one with the lowest Pol\(_1\) is closest to “maximal diversity.” However, the difference between “equal powers” and “unbalanced extremes” (as visible for example in the moderates of Fig. 5 B and F) would need other measures which we did not explore further in this study.

The validity of the pentamodal model’s assumption of specific groups of extremists and centrists, different from the rest of moderates who can be everywhere at the response scale could be increased by further evidence independent of the stylized facts of the distributions. Two possibilities are (a) investigating and sampling endogenous groups within the dataset like exemplary shown in Appendix 4 using party affiliation, or (b) constructing a survey experiment that verifies the true affiliation of participants into one of our five groups. By using the assumption of endogenously given groups, one may think of transferring the group-specific measurement concepts of Bramson et al. (2016) more directly by using the parameters of the Gaussian functions of the two moderate groups, e.g., by defining group divergence as \(|\mu _\text {L} - \mu _\text {L}|\), group consensus based on \(-(\sigma _\text {L} + \sigma _\text {R})\), size parity based on the parity of \(w_\text {ModL}\) and \(w_\text {ModL}\), and distinctness describing the ratio of the overlap of two groups. All these are feasible directions for future work and might help to further refine and improve the measurement of polarization.

We refrained from it because in the model parameters maybe sometimes sensitive to small fluctuations in the data around the midpoint. Therefore, using measures that utilize the groups of moderates individually, like divergence, distinctness, and group consensus, results in much higher uncertainty than Pol\(_1^\text {Mod}\) and resPol\(_0^\text {Mod}\) in our decomposition of Pol\(_0\).

Another possible direction of future improvement of characterizing polarization would be to build on the bipolarization measures by Foster and Wolfson (2009) and Wang and Tsui (2000). By following a solid axiomatic foundation and satisfying Esteban and Ray (1994) axioms, they qualify specifically for the bimodal distribution like the distribution of moderates which we assume in the pentamodal model. This would enable a new direction for more comprehensive rankings in future studies. However, the integration in the decomposition of Pol\(_0\) based on the pentamodal would be complicated because of the different nature of the measure.

Another direction of future research could be the reduction of the number of parameter of the pentamodal model through the identification of further regularities. Fitting eleven data points with eight free parameters looks like little improvement. It could, for example, be that means of moderates were related, e.g., when one group was close to the center the other was not. We made some explorations to find such relations of parameters of the 4155 attitude landscapes with the aim to construct a model with fewer parameters. We found some correlations between \(\mu _\text {L}\) and \(\sigma _\text {R}\) and analogously between \(\mu _\text {R}\) and \(\sigma _\text {L}\). We did not start to simplify the model based on this finding, because correlations were small and theoretical plausibility was not very strong. Our feeling is that, simpler models would come with a substantial loss of goodness-of-fit.

We apply the pentamodal model only to questions on eleven-point scales from zero to ten. Scherpenzeel (2002) outline why the eleven-point scale has the most advantages in the context of the Swiss Household Panel. Wu and Leung (2017) advocate the eleven-point response scale also as a means to reject the criticism that short Likert scales offer ordinal data only, precluding many arithmetic operations that can only be performed for interval data. Nevertheless, many surveys use shorter scales like seven or five-point response scales. The groups in the pentamodal model should be reasonable also when using these scales, but these scales have lower numbers of response options than the pentamodal model has parameters. So, for such scales, it seems more appropriate to develop new models with similar heuristics.

We want to note that the European Social Survey would also allow to study the polarization aspects of issue alignment (constraint principle) and issue partisanship (consolidation principle) of exogenously given groups, e.g., party supporters (see Baldassarri and Gelman 2008). However, this was beyond the scope of this study.

Potential limitations could apply from a psychometric viewpoint. This stance assumes that public opinion measures extracted from survey data contain measurement error. Their quality depends on item characteristics such as the bipolar eleven-point response scale used here.

However, Leung (2011) advocates the use of 11-point scales as compared to other Likert scales because of increased sensitivity which did not seem to come at the cost of cognitive fatigue. The latter argument is frequently made against the use of long scale formats but could not be supported empirically by the findings: Varying the number of response categories did not affect (re-scaled) means and standard deviations, factor loadings, average item-to-item correlations, or other psychometric scale properties. One can conclude that longer and more sensitive scale types do not distort the measurement of constructs. Our descriptive analysis shows that even more details may be observed thanks to sensitivity, for example, different types of pentamodal distributions.

Furthermore, the existence of one midpoint instead of two (as for example in a 10-point scale) seems to not threaten instrumental validity (see also Appendix 2). Chyung et al. (2017) showcase strategies to reduce the interpretative ambiguity of scale midpoints, among which is the inclusion of a “don’t know” response option, as present in the ESS data. The ESS thus follows this best practice recommendation, which helps our interpretation of responses. We do acknowledge that a degree of ambiguity remains and that the issue is an ongoing psychometric debate. With the possibility in the pentamodal model to characterize fractions of the center position to different ideological positions, we further acknowledge different psychometric hypotheses and motives for centrist self-placement (Rodon 2014).

Also from a psychometric perspective, Harzing (2006) criticized cross-national comparisons. She found that countries exhibit different response styles regarding agreement bias and extreme answering. This was related to differences in cultural dimensions such as extraversion, uncertainty avoidance or collectivism (Harzing 2006). This should be kept in mind as an alternative explanation for why we found no indication of an overachieving, cross-national trend. Generally, the ESS survey program was designed to allow cross-national attitude comparisons. Explicit measures taken to improve comparability are outlined on the ESS website as well as in bi-annual data quality reports (Wuyts and Looseweldt 2019).

The aforementioned limitations only apply from a psychometric and mostly individualistic viewpoint; however, we can also take survey responses at face value and take a societal viewpoint. Questions with ordered rating scales are not only asked as part of psychometric measurement instruments. They are used by face value in psychotherapy and pain regulation (Berg and De Shazer 1993; Farrar et al. 2001), e.g., a pain assessment of a patient is not to be judged by the therapist as potentially subject to measurement error, but as basis to judge the effectiveness of therapeutic interventions by the patient. In the form of stars, rating scales are used in online recommendation systems for movies, dining places, and all sorts of consumer products, service providers, and customers. That way, numerical attitudes toward products are communicated, negotiated, and judged for interpersonal purposes and gain value in themselves. Further on, individual measurement error is likely to not play a large role on the societal level.

In a similar way, ordered rating scales are also the basis of modern range voting systems, e.g., majority judgment, which Balinski and Laraki (2011) proposed using the example of French presidential election. In that voting system, each voter has to assess each candidate with a rating from “reject” to “excellent.” The voting system extracts the median rating for each candidate and declares the candidate with the highest median as the winner.Footnote 13 Range voting systems aim to reduce advantages for candidates which are politically polarizing. The pentamodal model may help to classify the empirical conditions when the reduction of such advantages can be empirically realized.