1 Introduction

Immigration is one of the most pressing challenges that EU countries have faced in recent years. Many European countries have had to respond to the most severe migratory challenge since the end of the Second World War; in particular, 2016 saw an unprecedented arrival of refugees and irregular migrants reaching Europe by crossing the Mediterranean Sea, with 181,400 people arriving in Italy and 173,450 in Greece.

According to data from the Organization for Economic Cooperation and Development, the number of new asylum-seekers reached a record of about 1.6 million in 2015 with approximately 22% of refugees coming from Syria, by far the leading country of origin. As confirmed by the latest Eurostat data, asylum seekers came from nearly 150 countries, with a growing share of applicants coming from visa-free countries (26% of first time applicants in 2020). The influx of immigrants to Europe has intensified the debate on the acceptance of newcomers into European countries and the threat posed by immigrants to members of the host society. Understanding what drives the individual and cross-national variations in public support for (or opposition to) immigration is therefore an issue of central importance for academics and policymakers.

To study attitudes toward immigration, we analyze the cross-national ESS (European Social Survey) data, which measure changes in social structure, conditions, and opinions in Europe. In particular, we focus on the last five rounds of the survey (i.e., ESS 5, 2010; ESS 6, 2012; ESS 7, 2014; ESS 8, 2016; ESS 9, 2018), a period in which the problem of immigration has become particularly acute in Europe. Our aim is to characterize homogeneous groups of Europeans presenting similar levels of attitude toward immigration, conceived as a latent trait. In such a context, an interesting research question concerns the effect of different socio-economic covariates on the probability of belonging to the different groups. We are interested in analyzing the probabilities of belonging to the classes with the higher or lower immigration acceptance level depending on age, number of completed years of education, feeling about household’s income, size of place of living, country of citizenship, and work experience abroad. In this way we can also show which countries tend to be more or less positive toward migrants. In particular, by also considering how attitudes toward immigration evolve over time, the article addresses questions such as: Do attitudes of the public vary across Europe? Which countries have the lowest and highest immigration acceptance? How do the patterns vary across time? Are the European publics becoming more tolerant over the successive rounds of the ESS?

The ESS is the most highly regarded cross-national survey program in the world, conducting rigorous representative surveys to the highest professional and methodological standards across Europe (Footnote 1). The survey measures the attitudes, beliefs, and behavior patterns of diverse populations across European countries. Many of the questions fielded in the most recent rounds of the survey are repetitions of questions administered almost two decades ago (in the first round of the ESS, i.e., ESS 1, 2002). This makes it possible to chart trends in attitudes over time and to compare developments in different European countries. Therefore, the survey provides one of the most authoritative databases to study attitudes toward immigration across countries and time.

Among the items of the questionnaire adopted within ESS, we explore six polytomous items with ordered responses concerning attitudes toward immigrants (of the same race, of a different race, and from poorer countries outside Europe) and the perceived costs and benefits of migration (for country’s economy, country’s cultural life, and country’s place to live). Therefore, we propose to equally divide the items into two latent dimensions: general acceptance of immigration and impact of immigration on the host country.

For the analysis of the responses to the six items at issue, we adopt a latent class (LC) approach including covariates and we also consider certain Item Response Theory (IRT) parametrizations. Due to the inclusion, among the covariates, of dummy explanatory variables for the country, we account for the multilevel structure of the data provided for individual Europeans. Moreover, in addition to the study of the distribution of the latent traits of interest related to the attitude toward immigration, and its dependence on the covariates, the adopted approach allows us to study the problem of invariance of the questionnaire among countries and rounds (see Steenkamp and Baumgartner 1998; Kankaraš et al. 2010; Millsap 2011; Davidov et al. 2014, 2018). This is a crucial issue to achieve comparability between these different situations.

The remainder of this paper is organized as follows. In the next section (Sect. 2) we review the relevant literature, describing the most important studies concerning immigration based on the ESS data. Then, we describe the data and outline the adopted modeling approach in Sects. 3 and 4, respectively. The empirical analysis is presented in Sect. 5. Final remarks are given in the final section.

2 Literature review

European societies are experiencing an increasing rate of immigration and it is thus not surprising that the literature on attitudes toward immigrants has multiplied in recent years (Hainmueller and Hopkins 2014; Fitzgerald and Awar 2018). This literature suggests that negative attitudes toward immigrants and immigration are affected by both individual-level and country-level factors. At the individual level, studies focus on socio-demographic characteristics as determinants of attitudes toward newcomers. These studies show that unemployment, low income, a vulnerable socio-economic position, or low levels of education may increase the perception of economic threat due to immigration (Coenders and Scheepers 2003; Raijman et al. 2003; Kunovich 2004; Semyonov et al. 2008; Gorodzeisky 2011).

In addition, Espenshade and Hempstead (1996) found that people who are alienated politically may be looking for others to blame and, consequently, may be more negative toward immigrants. It is also argued that political conservatism is an important predictor of immigrants’ derogation (e.g., Semyonov et al. 2006). Additionally, some authors show that members of the host society overestimate the size of the immigrant population which, in turn, results in more negative attitudes toward them (e.g., Schneider 2008; Gorodzeisky and Semyonov 2020).

The second line of research conjectures that, instead of focussing only on individual differences, one should examine cross-country variation in attitudes toward immigration. It is noted that large or increasing immigration flows, deteriorating economic conditions (Scheepers et al. 2002; Semyonov et al. 2006; Schlueter and Wagner 2008; Crepaz and Damron 2009; Rustenbach 2010; Kuntz et al. 2017), policies that are not aimed at strengthening integration of newcomers (Schlueter et al. 2018; Green et al. 2019), negative media reports related to immigration (Schlueter and Davidov 2013; Schlueter et al. 2018), negative events such as economic shocks (Meuleman et al. 2018), or terrorist attacks (Schlueter et al. 2018) may all result in more negative attitudes toward immigration in a country. However, the relations between economic conditions, size of immigrant population, and anti-immigrant attitudes are not confirmed in the studies presented by Sides and Citrin (2007) or Strabac and Listhaug (2008), among others.

Massive immigration has continued to fuel the debate, as academics and policymakers have not yet reached a consensus on what drives natives to view immigration as threatening and why similar people living in different countries tend to vary greatly in their opinions, even after controlling for socio-economic differences (Raijman et al. 2003). Note that most of the previous studies based on the ESS data concentrate on the relationships between individual characteristics or country-level features (e.g., welfare policy, the size of the immigrant population, economic conditions, and foreign investments) and anti-immigrant attitudes in the earlier rounds (e.g., Schlueter and Wagner 2008; Rustenbach 2010; Davidov and Meuleman 2012; Markaki and Longhi 2012; Nagayoshi and Hjerm 2015). In contrast, the present work focuses on the evolution of individual attitudes toward immigration in the different EU host countries in the last few years (2010–2018), in which problems related to immigration have become particularly acute. In this way we show which countries tend to be more or less positive toward immigration and we analyze the temporal dynamics of the phenomenon under study.

As in most of the cited works, we base our analysis on individual-level responses. Moreover, we consider both baseline and time-varying socio-demographic characteristics that are assumed to impact on the immigration attitudes evolving over time. Additionally, including dummy explanatory variables for the country, we also account for the multilevel structure of the data.

Another contribution of our study is that, although we base our study on the individual-level responses, we conceive the attitude toward immigration as a discrete latent (non-observable) construct, analyzed by suitable LC models. In fact, this attitude is not directly measurable and depends on individual characteristics that are not directly observable, such as cultural background, economic status, and political views. People might be more likely to have anti-immigrant attitudes when they cannot relate to the culture of the immigrants (e.g., ethnic background). Economic competition and anti-immigrant attitudes may occur because immigrants are taking the jobs of native workers especially at the bottom of the labour hierarchy. Also, right or left political orientations may clarify why differences in anti-immigrant sentiments occur. In fact, according to Dennison (2017): “Attitudes to immigration at the individual level can be powerfully predicted by fundamental psychological traits, with individuals displaying openness and excitability more drawn toward pro-immigration positions and those displaying conscientiousness and concern over safety more drawn toward anti-immigration positions”.

3 Data presentation

The ESS is an academically driven cross-national survey that has been conducted across Europe since its establishment in 2001. Every two years, face-to-face interviews are conducted with newly selected cross-sectional samples. The survey measures the attitudes, beliefs, and behavior patterns of diverse populations in different nations (Footnote 2).

Here we focus, in particular, on the changing attitudes in the EU countries in recent years (2010–2018), that is, on the last five rounds of the survey (ESS 5, 2010; ESS 6, 2012; ESS 7, 2014; ESS 8, 2016; ESS 9, 2018). The analyzed dataset, which includes only individuals with complete data, consists of a sample of 101,106 respondents living in the 12 European countries that took part in all of the ESS rounds of interest (Footnote 3). We consider six items with ordinal responses concerning different aspects of immigration acceptance, which measure two dimensions: general acceptance of immigration and impact of immigration on the host country. As the factor analysis of the first three items yielded a single dimension, they are also labelled as allowance, whereas the other three are known as symbolic and economic threat (Heath et al. 2016) or realistic threat (Davidov et al. 2018). Overall, the first three items given below measure the first dimension (referring to the extent to which the country should accept different groups of immigrants) and the other three (concerning different effects of immigration on a country) define the second one:

  • \(Y_1\) – allow many/few immigrants of the same race/ethnic group as majority population (1–allow none, 2–allow few, 3–allow some, 4–allow many to come and live here);

  • \(Y_2\) – allow many/few immigrants of different race/ethnic group from majority population (1–allow none, 2–allow few, 3–allow some, 4–allow many to come and live here);

  • \(Y_3\) – allow many/few immigrants from poorer countries outside Europe (1–allow none, 2–allow few, 3–allow some, 4–allow many to come and live here);

  • \(Y_4\) – immigration bad or good for country’s economy (1–bad, ..., 11–very good for the economy);

  • \(Y_5\) – country’s cultural life undermined or enriched by immigrants (1–undermined, ..., 11–enriched cultural life);

  • \(Y_6\) – immigrants make country a worse or better place to live (1–worse, ..., 11–better place to live).

Originally, the first three items (\(Y_1\)–\(Y_3\)) had the reverse order of the (four) response categories, whereas we keep the original order of the 11-point Likert-type scale for the last three items (\(Y_4\)–\(Y_6\)). In fact, previous studies concluded that immigration attitudes have become slightly positive in recent years; therefore, we prefer to analyze the acceptance of immigration, as opposed to the more common analyses of anti-immigration attitudes.

It is worth comparing the data structure we use for the analysis with that used in other studies. In particular, Davidov and Meuleman (2012) analyzed one “combined” scale variable, reject, measuring the rejection of immigrants on the basis of the first three items (\(Y_1\)–\(Y_3\)) for the first three rounds of the ESS. Each of the three items of this measurement scale inquires whether respondents would like their country to allow only a few or many immigrants of a certain group to come. The original responses are registered on a 4-point scale (1–allow many, 4–allow none), and the scale variable reject is operationalized as the average over these three indicators. Markaki and Longhi (2012), in their analyses, converted the 11-point scales of items \(Y_4\)–\(Y_6\) (corresponding to our second latent dimension) into binary variables for certain rounds (ESS 1, 2002; ESS 2, 2004; ESS 3, 2006; and ESS 4, 2008). They assigned the value 1 to those who answer 1–5 (immigration is bad for the economy; undermining cultural life; worsening life in the country) and the value 0 to those who answer 6–11 (immigration is good for the economy; enriching cultural life; improving life in the country). Note that in our study we keep the original order of the 11-point scale for these three items, so as to avoid information loss. Moreover, most of these works (see also Ervasti et al. 2008; Schlueter and Wagner 2008; Gorodzeisky and Semyonov 2009; Meuleman et al. 2009; Gorodzeisky 2011) focus on the negative perception, that is, the anti-immigrant attitude, perceived threat of immigration, or immigrant derogation.
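The coding choices compared above can be made concrete with a minimal sketch. It contrasts three treatments of the raw responses: reversing the original 4-point scale so that higher scores mean greater acceptance, averaging the three original items into a reject-type score, and dichotomizing an 11-point item at 1–5 versus 6–11. The function names and the example responses are hypothetical; they are not taken from the ESS files or from the code of the cited studies.

```python
def reverse_four_point(original):
    """Map the original coding (1 = allow many, ..., 4 = allow none)
    to the coding used here (1 = allow none, ..., 4 = allow many)."""
    return 5 - original


def reject_score(original_items):
    """Average of the three original 4-point items (higher = more rejection)."""
    return sum(original_items) / len(original_items)


def threat_binary(score_11):
    """Binary recode of an 11-point item: 1 for answers 1-5, 0 for answers 6-11."""
    return 1 if score_11 <= 5 else 0


# Hypothetical respondent: original answers to the three allowance items.
original = [1, 2, 3]
reversed_items = [reverse_four_point(v) for v in original]

print(reversed_items)                       # acceptance-oriented coding
print(reject_score(original))               # reject-type average
print(threat_binary(5), threat_binary(6))   # cut point between 5 and 6
```

The binary recode maps six distinct answers (6–11) to the same value, which is the information loss that keeping the full 11-point scale avoids.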

For a preliminary description of the phenomenon under study, Table 1 reports the means \(\bar{Y}_j\) of the item responses for the ESS rounds of interest, covering the period 2010–2018, whereas Table 2 shows the means of each response variable for every country. The mean \(\bar{Y}_j\) is computed for each item after assigning scores 1 to 4 (\(Y_1\)–\(Y_3\)) or 1 to 11 (\(Y_4\)–\(Y_6\)) to the categories in increasing order. The full tables including the distribution of the item responses for each country and round are presented in the Appendix (see Tables 13 and 14).

We emphasize that since our study is based on combining data from different countries and rounds, the design weights in combination with population size weights (European Social Survey 2014, Sec. 2 and 3) are applied in computing descriptive statistics as well as in the estimation part of our analysis.
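As a rough illustration of how such weights enter a descriptive statistic, the sketch below computes a weighted item mean in which each respondent's weight is the product of a design weight and a population size weight. The variable names and numbers are hypothetical, and the product rule is stated here as an assumption about how the two weights are combined; the ESS weighting documentation should be consulted for the actual procedure.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(w_i * y_i) / sum(w_i)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical responses to one 4-point item from three respondents.
responses = [3, 2, 4]
design_weights = [1.0, 0.5, 1.5]       # correct for unequal selection probabilities
population_weights = [2.0, 2.0, 1.0]   # correct for different country sizes

# Assumption: the combined weight is the product of the two weights.
combined = [d * p for d, p in zip(design_weights, population_weights)]

mean_score = weighted_mean(responses, combined)
print(round(mean_score, 3))
```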

Table 1 Weighted average scores (\(\bar{Y}_j\)) by year
Table 2 Weighted average scores (\(\bar{Y}_j\)) by country

We stress that surveys such as the ESS face methodological challenges with respect to the comparability of the concepts used in different countries. Concepts that are popular in one country may not be common in another, people may understand specific questions differently across countries, translations might be imprecise and lead to biased scores, and people in various countries might use response scales differently when responding to survey questions (Cieciuch et al. 2015; Cieciuch and Davidov 2016). However, Davidov et al. (2018) suggested that the two latent constructs of interest here, allowance and realistic threat, may be used for cross-country comparisons with confidence, as they displayed approximate scalar invariance across most ESS countries in their study.

Overall, responses are mainly concentrated on the middle category (the third category for the items of the first dimension and the sixth category for those of the second), whereas category 1, corresponding to the lowest level of immigration acceptance, is selected less than 15% of the times for each item and in each round (see Table 13). We also observe a higher percentage of acceptance for immigrants of the same race/ethnic group than for immigrants from poorer countries outside Europe. However, we observe a clear decrease in the last two rounds in the share of those who would allow no immigrants from poorer countries outside Europe, and public opinion is not as “polarized” as in the previous rounds. In this regard, Ford (2017) compared just two rounds of the survey and concluded that attitudes became somewhat more “polarized” between 2002 and 2014, particularly in the case of attitudes toward migrants from poor countries outside Europe. He showed an increase from 11% (in 2002) to 20% (in 2014) of those who feel that none of these migrants should be allowed to come. At the same time, an increase was observed in the percentage of people who feel that many such migrants should be allowed to enter, from 11 to 12%.

In terms of preferred immigrants, some differences are also observed among countries. In most of the countries, the percentage of those who believe that cultural life is enriched by immigrants, or that immigration makes the country a better place to live, is higher than the percentage of those who believe that immigration is good for the economy. Sweden, Germany, and Finland are the most positive toward immigrants, especially as far as the items corresponding to the second dimension are concerned. On the contrary, the Czech Republic is the most negative, being characterized by the lowest average scores for all six items.

We also consider important socio-economic background characteristics of the respondents introduced by a suitable structure of covariates (with possible categories indicated in brackets for categorical variables):

  • round – included by dummy variables (“Round 5, 2010” as reference category, “Round 6, 2012”, “Round 7, 2014”, “Round 8, 2016”, “Round 9, 2018”);

  • country – included by dummy variables (“DE–Germany” as reference country, “CZ–Czech Republic”, “BE–Belgium”, “EE–Estonia”, “FI–Finland”, “FR–France”, “GB–United Kingdom”, “IE–Ireland”, “NL–Netherlands”, “PL–Poland”, “SE–Sweden”, “SI–Slovenia”);

  • gndr – gender (0–“female” (F) as reference category, 1–“male” (M));

  • agea – age of respondent;

  • domcil – place of living, included by dummy variables (1–“big city” (BC), 2–“suburbs or outskirts of big city” (SBC), 3–“town or small city” (T) as reference category, 4–“country village” (V), 5–“farm or home in countryside” (C));

  • ctzcntr – citizen of the country (0–“no”, 1–“yes” as reference category);

  • eduyrs – number of years of full-time education completed;

  • wrkac6m – paid work in another country, period of more than 6 months in the last 10 years (0–“no” as reference category, 1–“yes”);

  • uemp3m – ever unemployed and seeking work for a period more than three months (0–“no” as reference category, 1–“yes”);

  • pdwrk – paid work during the last 7 years (0–“no”, 1–“yes” as reference category);

  • hincfel – feeling about household’s income nowadays (1–“very difficult on present income”, 2–“difficult on present income”, 3–“coping on present income”, 4–“living comfortably on present income”).

Summary statistics for the distribution of the covariates are reported in Table 3.

Table 3 Weighted frequency distribution for each covariate (%) in years 2010–2018

We notice that the majority of respondents are citizens of the country, are female, and live in towns or small cities, with an average number of years of education equal to 13.58. The respondents are mainly adults, with an average age of over 49. Most of them report coping on their present income and having had paid work (59.26% of the respondents) during the last seven years. Over 5% of the respondents had paid work in another country for a period longer than 6 months, and over 30% had experienced being unemployed and seeking work for a period longer than three months.

4 Methodology

For the analysis of the data described in the previous section and to address our research questions about the evolution of attitudes toward immigration, we adopt the LC approach (Lazarsfeld and Henry 1968; Goodman 1974). In particular, we consider the LC model with covariates (Dayton and Macready 1988; Bandeen-Roche et al. 1997; Vermunt 2010), also in its IRT version for polytomous ordinal items proposed in Bacci et al. (2014) that extends the approach introduced by Bartolucci (2007) for dichotomous items (see also von Davier 2008). We suppose that the items measure a certain number of latent traits (general acceptance of immigration and impact of immigration on the host country in the context of our study). A crucial assumption characterizing the models at issue is the discreteness of the distribution of the latent traits, giving rise to a finite number of latent classes, each one characterized by the same latent trait levels in the IRT version. Moreover, we assume that the individual covariates affect the probability of belonging to the different classes. Among these covariates we consider the country of the respondent, so that the adopted models account for the multilevel data structure by fixed effects. In the following we first describe the general LC approach, then its IRT version for polytomous ordinal items, and then we outline maximum likelihood estimation of these models.

4.1 Latent class approach

Let \(Y_{ij}\) denote the response variable for individual i and item j, where \(i=1,\ldots ,n\) and \(j=1,\ldots ,r\), with n denoting the overall number of individuals in the survey (101,106 in our application) and r denoting the number of items (6 in our application). Each variable \(Y_{ij}\) has \(l_j\) categories indexed from 0 to \(l_j-1\); in our application \(l_j=4\) for \(j=1, 2, 3\) and \(l_j=11\) for \(j=4, 5, 6\). The observed responses \(y_{ij}\) are collected in the vectors \({{\varvec{y}}}_i=(y_{i1},\ldots ,y_{ir})'\) and we also observe a column vector of fixed covariates \({{\varvec{x}}}_i\) for every i.

The LC model (Lazarsfeld and Henry 1968; Goodman 1974) assumes that the population is divided into a certain number k of latent classes that are identified by the individual latent variables \(U_i\), \(i=1,\ldots ,n\). These variables are discrete with k support points, from 1 to k, and given \(U_i\) the random variables in \({{\varvec{Y}}}_i\) are assumed to be conditionally independent. This is the well-known assumption of local independence, which is common to most latent trait models.

Each latent class is characterized by a specific conditional distribution of the response variables given this class. This is formalized by the following probabilities

$$\begin{aligned} \phi _{jy|u}=p(Y_{ij}=y|U_i=u),\quad i=1,\ldots ,n,\,j=1,\ldots ,r,\,u=1,\ldots ,k, \end{aligned}$$
(1)

depending on individual i only through the class he/she belongs to. Moreover, for each class we have a conditional probability of belonging to this class given the covariates, that is,

$$\begin{aligned} \pi _u({{\varvec{x}}}_i)=p(U_i=u|{{\varvec{x}}}_i),\quad i=1,\ldots ,n. \end{aligned}$$

In order to model these probabilities, we adopt a multinomial logit parametrization, under which

$$\begin{aligned} \log \frac{\pi _u({{\varvec{x}}}_i)}{\pi _1({{\varvec{x}}}_i)}=\beta _{0u}+ {{\varvec{x}}}_{i}'{{\varvec{\beta }}}_{1u}, \quad u=2,\ldots ,k, \end{aligned}$$
(2)

with class-specific intercepts \(\beta _{0u}\) and regression parameters \({{\varvec{\beta }}}_{1u}\).
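A minimal sketch of how the multinomial logit parametrization in (2) maps covariates to class membership probabilities is given below, with class 1 as the reference. The function name, the covariate vector, and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the ESS data.

```python
import math


def class_probabilities(x, intercepts, slopes):
    """Return [pi_1(x), ..., pi_k(x)] under the multinomial logit in (2).

    intercepts[u - 2] and slopes[u - 2] hold beta_0u and beta_1u for
    u = 2, ..., k; class 1 is the reference class.
    """
    linear = [0.0]  # log-odds of the reference class against itself
    for b0, b1 in zip(intercepts, slopes):
        linear.append(b0 + sum(xi * bi for xi, bi in zip(x, b1)))
    shift = max(linear)  # subtract the maximum for numerical stability
    expo = [math.exp(v - shift) for v in linear]
    total = sum(expo)
    return [e / total for e in expo]


# Hypothetical example with k = 3 classes and two covariates.
x = [1.0, 0.0]
pi = class_probabilities(x, intercepts=[0.5, -1.0],
                         slopes=[[0.2, 0.1], [-0.3, 0.4]])
print([round(p, 4) for p in pi])
```

By construction, the log-ratio of each class probability to that of class 1 returns the linear predictor on the right-hand side of (2).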

4.2 LC-IRT approach

The previous approach may be made more parsimonious by considering an IRT parametrization of the conditional response probabilities based on explicitly considering the number of latent traits measured by the items. Let q be the number of these latent traits, also called dimensions (2 in our case), and let \({{\varvec{\theta }}}_u=(\theta _{u1}, \ldots , \theta _{uq})'\) be the vector of latent trait levels for the individuals in class u.

In order to allow the conditional response probabilities \(\phi _{jy|u}\) defined in (1) to depend on \({{\varvec{\theta }}}_u\), we consider the general formulation proposed by Bacci et al. (2014); see also Bartolucci (2007). In particular, we consider the probability vectors \({{\varvec{\phi }}}_{j|u}=(\phi _{j0|u},\cdots , \phi _{j,l_j-1|u})'\), the elements of which sum up to 1. The adopted IRT formulation assumes for \(j=1,\ldots ,r\) that

$$\begin{aligned} g_y({{\varvec{\phi }}}_{j|u})=\alpha _j\left( \sum _{d=1}^{q}\delta _{jd}\theta _{ud}-\tau _{jy}\right) ,\quad y=1,\ldots ,l_j-1,\,u=1,\ldots ,k, \end{aligned}$$

where \(\delta _{jd}\) is a dummy variable equal to 1 if item j measures latent trait of type d and to 0 otherwise, with \(d=1,\ldots ,q\) and \(j=1,\ldots ,r\). Moreover, \(g_y(\cdot )\) is a link function specific to category y, and \(\alpha _j\) and \(\tau _{jy}\) are item parameters, usually referred to as discriminating and difficulty indices, on which suitable constraints need to be imposed.

Among the possible parametrizations, for the full list see Bacci et al. (2014), we consider that based on the so-called global logits that are strongly related to the cumulative logits for ordinal variables (Agresti 2012, Ch. 8). This leads to the LC (and also multidimensional) version of the popular Graded Response Model of Samejima (1969), here denoted by LC-GRM, with covariates. In a more explicit form, for \(j=1,\ldots ,r\), this model is based on the assumption

$$\begin{aligned} \log \frac{p(Y_{ij}\ge y|U_i=u)}{p(Y_{ij}< y|U_i=u)}=\alpha _j\left( \sum _{d=1}^{q}\delta _{jd}\theta _{ud}-\tau _{jy}\right) ,\quad u=1,\ldots ,k,\, y=1,\ldots ,l_j-1, \end{aligned}$$
(3)

so that the ordinal structure of the items is taken into account.
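To illustrate how assumption (3) determines the conditional response probabilities, the following sketch reads (3) on the global-logit scale, that is, with the logarithm of the odds of \(Y_{ij}\ge y\) against \(Y_{ij}<y\) on the left-hand side. It inverts these logits and takes differences of adjacent probabilities \(p(Y_{ij}\ge y|U_i=u)\). The function names and parameter values are hypothetical, and the difficulty parameters are assumed to be increasing in y so that all probabilities are non-negative.

```python
import math


def expit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def grm_probs(theta, alpha, taus):
    """Return [phi_0, ..., phi_{l-1}] for one item and one latent class.

    theta is the class support point on the dimension measured by the item,
    alpha the discriminating index, and taus the l - 1 increasing difficulty
    parameters tau_1 < ... < tau_{l-1}.
    """
    # upper[y] = p(Y >= y | class), with p(Y >= 0) = 1 and p(Y >= l) = 0
    upper = [1.0] + [expit(alpha * (theta - t)) for t in taus] + [0.0]
    return [upper[y] - upper[y + 1] for y in range(len(taus) + 1)]


# Hypothetical 4-category item evaluated at theta = 0.
phi = grm_probs(theta=0.0, alpha=1.0, taus=[-1.0, 0.0, 1.0])
print([round(p, 4) for p in phi])
```

With symmetric difficulties around the support point, as in this example, the two extreme categories receive the same probability, and so do the two middle ones.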

It is also possible to consider simplified versions of the model based on the previous assumption, such as that based on equally spaced difficulty parameters \(\tau _{jy}\), which is related to the rating scale version of the GRM introduced by Muraki (1990), and that based on all discrimination parameters \(\alpha _j\) equal to 1 (or to an arbitrary positive value), leading to the one-parameter logistic parametrization; see also Van der Ark (2001). Obviously, the first constraint makes sense only when all items have the same number of response categories.

Note that also under the IRT specification of the LC model, the effect of the covariates on the probabilities \(\pi _u({{\varvec{x}}}_i)\) may be modeled through a multinomial logit parametrization as in (2). However, exploiting the possible order of the classes induced by the support points \({{\varvec{\theta }}}_u\), we can use the parametrization

$$\begin{aligned} \log \frac{1-\pi ^*_{u-1}({{\varvec{x}}}_i)}{\pi _{u-1}^*({{\varvec{x}}}_i)}=\beta _{0u}+ {{\varvec{x}}}_{i}'{{\varvec{\beta }}}_{1}, \quad u=2,\ldots ,k, \end{aligned}$$

which is again based on ordinal logits, where

$$\begin{aligned} \pi _{u}^*({{\varvec{x}}}_i)=\pi _1({{\varvec{x}}}_i)+\ldots +\pi _{u}({{\varvec{x}}}_i) \end{aligned}$$
(4)

are cumulative probabilities.

The estimates of the regression parameters in this parametrization may be also illustrated in a simpler way as we have a single regression coefficient for each covariate.
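The ordered parametrization can be sketched as follows, assuming, in line with the remark above, one regression coefficient per covariate shared by all logits and class-specific intercepts. Under this assumption the cumulative probabilities in (4) are obtained by inverting the logits, and the class probabilities follow by differencing. The names and values are hypothetical; the intercepts must be decreasing in u for the resulting probabilities to be non-negative.

```python
import math


def expit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_class_probabilities(x, intercepts, slope):
    """Return [pi_1(x), ..., pi_k(x)] under global logits on the class weights.

    intercepts holds beta_0u for u = 2, ..., k (decreasing in u);
    slope is the coefficient vector assumed common to all logits.
    """
    xb = sum(xi * bi for xi, bi in zip(x, slope))
    # log[(1 - pi*_{u-1}) / pi*_{u-1}] = eta_u  =>  pi*_{u-1} = expit(-eta_u)
    cumulative = [expit(-(b0 + xb)) for b0 in intercepts] + [1.0]
    probs = [cumulative[0]]
    for u in range(1, len(cumulative)):
        probs.append(cumulative[u] - cumulative[u - 1])
    return probs


# Hypothetical example with k = 3 ordered classes and one covariate.
probs = ordered_class_probabilities(x=[1.0], intercepts=[1.0, -1.0], slope=[0.5])
print([round(p, 4) for p in probs])
```

A positive coefficient raises every logit by the same amount and therefore shifts probability mass toward the higher classes, which is what makes the estimates easy to read.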

4.3 Likelihood based inference

To estimate the models illustrated above, we maximize the log-likelihood function

$$\begin{aligned} \ell ({{\varvec{\theta }}})=\sum _{i=1}^{n} \log p({{\varvec{y}}}_i|{{\varvec{x}}}_i), \end{aligned}$$
(5)

based on the assumption of independence between sample units and where \(p({{\varvec{y}}}_i|{{\varvec{x}}}_i)\) is the manifest probability of the observed sequence of responses for this individual, given the covariates. This probability may be computed as

$$\begin{aligned} p({{\varvec{y}}}_i|{{\varvec{x}}}_i)=\sum _{u=1}^{k}p_u({{\varvec{y}}}_i)\pi _u({{\varvec{x}}}_i), \end{aligned}$$

with \(p_u({{\varvec{y}}}_i)=\prod _{j=1}^r \phi _{jy_{ij}|u}\) being the conditional probability of observing the response vector \({{\varvec{y}}}_i\) given that individual i belongs to latent class u (i.e., \(U_i=u\)). We recall that the probabilities \(\phi _{jy|u}\), included in the previous expression, are directly used as parameters under the initial LC model formulation, while they depend on the parameters \({{\varvec{\theta }}}_u\) under the GRM formulation; see assumption (3).

When available, sample weights may be accounted for in the estimation process by including them in the log-likelihood function. In particular, let \(w_i\) denote the weight for individual i. The weighted log-likelihood function has expression

$$\begin{aligned} \ell ({{\varvec{\theta }}})=\sum _{i=1}^{n} w_i\log p({{\varvec{y}}}_i|{{\varvec{x}}}_i). \end{aligned}$$
(6)

Note that we use normalized weights, so that \(\sum _{i=1}^nw_i=n\), in order to properly apply the model selection criteria described in the following.
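The following sketch evaluates the manifest probability and the weighted log-likelihood (6) for a toy setting with two classes and two binary items, rescaling the raw weights so that they sum to n. It only evaluates the objective function at fixed parameter values; it does not implement the EM algorithm. All names, probabilities, and weights are hypothetical.

```python
import math


def manifest_probability(y, phi, pi):
    """p(y_i | x_i) = sum_u pi_u * prod_j phi[u][j][y_ij] (local independence)."""
    total = 0.0
    for u, pi_u in enumerate(pi):
        p_u = 1.0
        for j, y_j in enumerate(y):
            p_u *= phi[u][j][y_j]
        total += pi_u * p_u
    return total


def weighted_loglik(responses, class_probs, phi, raw_weights):
    """Weighted log-likelihood with weights normalized to sum to n."""
    n = len(responses)
    scale = n / sum(raw_weights)
    return sum(scale * w * math.log(manifest_probability(y, phi, pi))
               for y, pi, w in zip(responses, class_probs, raw_weights))


# phi[u][j][y]: class u, item j, category y (two classes, two binary items).
phi = [
    [[0.8, 0.2], [0.7, 0.3]],  # class 1 tends to choose category 0
    [[0.3, 0.7], [0.2, 0.8]],  # class 2 tends to choose category 1
]
responses = [[0, 0], [1, 1]]
class_probs = [[0.5, 0.5], [0.5, 0.5]]  # pi_u(x_i) for each individual
raw_weights = [2.0, 2.0]                # rescaled internally to sum to n = 2

loglik = weighted_loglik(responses, class_probs, phi, raw_weights)
print(round(loglik, 4))
```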

Maximum likelihood estimation is performed by the EM algorithm (Dempster et al. 1977), and in particular we use the implementation available in the R package MultiLCIRT (Bartolucci et al. 2016) described in Bartolucci et al. (2014). For a detailed description of this algorithm in the context of LC-IRT models see Bartolucci (2007), which covers the case in which sample weights are not used and the log-likelihood is equal to (5). In the presence of such weights, the log-likelihood (6) can be maximized by a simple modification of this algorithm consisting, in particular, in using a slightly different version of the M-step, in which the parameter estimates are updated.

In applications, a crucial point is the selection of the most suitable model for the data at hand in terms of number of latent classes (k), number of dimensions (q), and the possible constraints on the item parameters. In particular, we rely on information criteria such as the Akaike Information Criterion (AIC; Akaike 1973) and the Bayesian Information Criterion (BIC; Schwarz 1978). We recall that these criteria are based on the following indices that must be minimized:

$$\begin{aligned} AIC= & {} -2\hat{\ell } + 2 \#\mathrm{par}, \end{aligned}$$
(7)
$$\begin{aligned} BIC= & {} -2\hat{\ell } + \log (n) \#\mathrm{par}. \end{aligned}$$
(8)

In the previous formulas, \(\hat{\ell }\) denotes the maximum log-likelihood of the model at issue and \(\#\mathrm{par}\) stands for the number of free parameters. In applying these criteria we look for the most parsimonious model specification when they lead to different choices.
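The two indices in (7) and (8) are straightforward to compute. The short sketch below uses hypothetical log-likelihood values and parameter counts to show a case in which AIC and BIC point to different models, so that the more parsimonious one would be retained.

```python
import math


def aic(loglik, n_par):
    """Akaike Information Criterion as in (7)."""
    return -2.0 * loglik + 2.0 * n_par


def bic(loglik, n_par, n):
    """Bayesian Information Criterion as in (8)."""
    return -2.0 * loglik + math.log(n) * n_par


# Hypothetical competing models fitted to n = 1000 observations.
n = 1000
model_a = {"loglik": -1000.0, "n_par": 10}
model_b = {"loglik": -990.0, "n_par": 18}

aic_a, aic_b = aic(**model_a), aic(**model_b)
bic_a, bic_b = bic(n=n, **model_a), bic(n=n, **model_b)

print(aic_a, aic_b)                      # AIC favours the larger model
print(round(bic_a, 2), round(bic_b, 2))  # BIC favours the smaller model
```

Because the BIC penalty grows with \(\log (n)\), it tends to select smaller models than AIC in large samples such as the one analyzed here.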

In order to assess the quality of clustering, the entropy is typically used (Biernacki and Govaert 1997; Celeux and Soromenho 1996), computed as

$$\begin{aligned} EN = -\sum _{i=1}^n\sum _{u=1}^k p(u|{{\varvec{y}}}_i) \log p(u|{{\varvec{y}}}_i), \end{aligned}$$
(9)

and the NEC (Normalized Entropy Criterion), proposed by Celeux and Soromenho (1996), computed as

$$\begin{aligned} NEC = \frac{EN}{\hat{\ell }-\hat{\ell }_1}, \end{aligned}$$
(10)

where \(\hat{\ell }_1\) denotes the maximum log-likelihood of the model with only one class (see also Biernacki et al. 2000). We recall that larger values of EN indicate a low separation between classes, whereas lower values of NEC are preferable, with the convention that for one class \(NEC=1\). The maximum value of EN is equal to \(n\log (k)\), corresponding to the case of all posterior probabilities of the latent classes equal to 1/k.
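The behaviour of the entropy in (9) at its two extremes, and the computation of NEC in (10), can be checked with a small sketch. The posterior probabilities and log-likelihood values below are hypothetical; terms with zero posterior probability are taken to contribute zero, following the usual convention \(0\log 0=0\).

```python
import math


def entropy(posteriors):
    """EN in (9); posteriors[i][u] is p(u | y_i). Zero terms contribute 0."""
    return -sum(p * math.log(p) for row in posteriors for p in row if p > 0)


def nec(en, loglik_k, loglik_1):
    """NEC in (10): entropy divided by the log-likelihood gain over one class."""
    return en / (loglik_k - loglik_1)


n, k = 4, 2
uniform = [[0.5, 0.5] for _ in range(n)]         # no separation between classes
separated = [[1.0, 0.0], [0.0, 1.0]] * (n // 2)  # perfect separation

print(entropy(uniform), n * math.log(k))  # the maximum n * log(k) is attained
print(entropy(separated))                 # the minimum is attained

# Hypothetical log-likelihoods for a k-class and a one-class model.
print(nec(entropy(uniform), loglik_k=-90.0, loglik_1=-100.0))
```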

Finally, we recall that after parameter estimation, sampled individuals may be assigned to the latent classes on the basis of the estimated posterior probabilities \(\hat{p}(u|{{\varvec{y}}}_i)\). In particular, \(\hat{p}(u|{{\varvec{y}}}_i)\) refers to the probability that individual i belongs to latent class u, namely, it is an estimate of

$$\begin{aligned} p(u|{{\varvec{y}}}_i) = \frac{p_u({{\varvec{y}}}_i)\pi _u({{\varvec{x}}}_i)}{p({{\varvec{y}}}_i|{{\varvec{x}}}_i)}. \end{aligned}$$

According to the Maximum-a-Posteriori rule, individual i is assigned to the class corresponding to the highest value of this probability.
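The posterior probabilities and the Maximum-a-Posteriori assignment can be sketched as follows for a single individual. The function names and the numerical values of \(p_u({{\varvec{y}}}_i)\) and \(\pi _u({{\varvec{x}}}_i)\) are hypothetical and serve only to illustrate the computation.

```python
def posterior(manifest, prior):
    """Posterior class probabilities for one individual.

    manifest[u]: p_u(y_i), probability of the observed responses given class u
    prior[u]:    pi_u(x_i), class weight given the individual covariates
    """
    joint = [m * p for m, p in zip(manifest, prior)]
    marginal = sum(joint)  # p(y_i | x_i), the denominator of the formula
    return [g / marginal for g in joint]

def map_class(post):
    """Maximum-a-Posteriori rule: class (numbered from 1) with the highest posterior."""
    return max(range(len(post)), key=post.__getitem__) + 1

# Hypothetical values for k = 3 classes
post_i = posterior(manifest=[0.02, 0.10, 0.05], prior=[0.5, 0.3, 0.2])
assigned = map_class(post_i)
```

In this toy case the individual is assigned to the second class, even though the first class has the largest prior weight, because the observed responses are much more likely under class 2.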

5 Empirical analysis

In applying the LC approach described in the previous section to the data illustrated in Sect. 2, we first dealt with model selection, starting with the optimal number of latent classes. We then tested invariance across countries and rounds, so as to explicitly consider the comparability between these different contexts. We also dealt with the hypothesis of unidimensionality, corresponding to the assumption that the two latent traits may be reduced to only one, and with other hypotheses of interest, amounting to a comparison of the LC-IRT and LC models. Particular emphasis was given to the interpretation of the estimates of the parameters involved in the conditional response probabilities and in the class weights.

5.1 Model selection and testing

5.1.1 LC model selection

In the first step of our analysis, we started with the LC model and for this model we considered the choice of the number of latent classes (k) trying to also determine whether this number is the same across countries and rounds.

To assess measurement invariance in a cross-cultural comparative setting, we followed the general procedure illustrated by Kankaraš et al. (2010). The model selection procedure usually starts by determining the required number of latent classes. Therefore, at the beginning we estimated, on the pooled set of data, the LC model with different parameters (conditional response probabilities and class weights) for each country and round, and no other covariates. Table 4 shows, for k from 1 to 8, the results in terms of AIC and BIC (as defined in (7) and (8)), along with the maximum log-likelihood and the number of free parameters for the fully heterogeneous and unrestricted multigroup LC model. For each fitted model, the table also reports the value of the entropy and of the \(\textit{NEC}\) as defined in (9) and (10), respectively.

Table 4 Information criteria, log-likelihood value (\(\hat{\ell }\)), and number of parameters (#par) for the LC models without covariates (in bold the lowest value of BIC)

The lowest value of BIC is reached for \(k = 6\) latent classes. The corresponding LC model, denoted by \(M_1\), has a huge number of free parameters, namely 14,340, which can be strongly reduced by introducing suitable constraints as discussed in the following.

Regarding class separation, note that the maximum value of the entropy amounts to 181,158; we can therefore consider the value of EN for 6 classes, equal to 23,387, as adequate. Moreover, the value of NEC is close to that of the best model among the fitted ones.
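As a rough check on scale, and using only the two figures quoted above, the entropy for 6 classes can be expressed as a fraction of its maximum \(n\log (k)\):

$$\begin{aligned} \frac{EN}{n\log (k)} = \frac{23{,}387}{181{,}158} \approx 0.129, \end{aligned}$$

that is, roughly 13% of the value that would be attained if all posterior probabilities were equal to 1/k.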

In the context of LC analysis, measurement invariance is established when the class-specific conditional response probabilities are equal across groups, so that a structurally equivalent (homogeneous) model is achieved. This implies that it is necessary to impose across-group equality restrictions on these conditional probabilities in order to test for measurement equivalence (Kankaraš et al. 2010). Therefore, in the next stage of our analysis we considered the LC model with 6 classes and the same measurement model, namely common conditional response probabilities across countries and rounds, in which the class weights are either constant or depend on these two covariates in an additive way. Table 5 shows the results, again in terms of AIC and BIC, for this model. Note that comparability is only established if we can impose across-group restrictions on the model parameters without deteriorating the fit to the data.

Table 5 Information criteria, log-likelihood value (\(\hat{\ell }\)), and number of parameters (#par) for different versions of the LC model with 6 classes and the same measurement model for all country/round and country and round as covariates

The model selected on the basis of the results in Table 5 is the last one, denoted hereafter by \(M_2\). This is an LC model with 6 classes and the same measurement model for all country/round, while the covariates country and round affect the class weights. This model is considerably better than the initial model (\(M_1\)) according to the BIC, and we therefore reach the important conclusion that the questionnaire items are measurement invariant over country and round of the survey.

5.1.2 LC-IRT model selection

In the next stage of our analysis, we tested the LC-GRM for polytomous ordinal items based on parametrization (3), allowing for bidimensionality, discreteness of the latent trait distribution, and time-constant and time-varying covariates under the multinomial logit parametrization formulated in (2). In fact, as suggested by the structure of the questionnaire, the items may be grouped into two dimensions corresponding to “general acceptance of immigrants” (\(Y_1\)-\(Y_3\)) and “impact of immigration on host countries” (\(Y_4\)-\(Y_6\)), respectively. An important issue is whether these two dimensions may be reduced to only one. This issue may be addressed by comparing the bidimensional model with 6 classes with its unidimensional counterpart (see Bartolucci 2007, for details) on the basis of AIC and BIC.

Table 6 reports the comparison of LC-GRM models with 6 classes, the same measurement model for all country/round, country and round as covariates, and different dimensional structures. Note that, for the data at hand, these models do not provide better results than the LC model with 6 classes presented in Table 5, which is indicated as model \(M_2\) above.

Table 6 Information criteria, log-likelihood value (\(\hat{\ell }\)), and number of parameters (#par) for LC-GRM models with 6 classes, the same measurement model for all country/round, country and round as covariates and different dimensions

In the next step we determined the contribution of the other covariates to the latent trait distribution. To this end, we adopted a forward stepwise selection process in which these covariates are included one at a time. The values of AIC and BIC after each new covariate is included are reported in Table 7. The inclusion of the covariates presented in Sect. 2 and not listed in that table does not lead to an improvement in terms of BIC.
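A forward stepwise search of this kind can be sketched as below. This is an illustration under stated assumptions, not the exact procedure used for Table 7: the sketch assumes that, at each step, the covariate giving the lowest BIC is added and that the search stops when no remaining covariate improves BIC. The callback `fit_bic`, the toy function `toy_bic`, the covariate name `noise`, and all numerical values are hypothetical.

```python
def forward_stepwise(candidates, fit_bic, base=()):
    """Greedy forward selection that minimizes BIC.

    candidates: iterable of covariate names
    fit_bic:    hypothetical callback; takes a tuple of covariate names,
                fits the corresponding model, and returns its BIC
    base:       covariates that are always kept in the model
    """
    selected = list(base)
    remaining = [c for c in candidates if c not in selected]
    best = fit_bic(tuple(selected))
    path = [(tuple(selected), best)]
    while remaining:
        # Try adding each remaining covariate, one at a time
        scores = {c: fit_bic(tuple(selected + [c])) for c in remaining}
        winner = min(scores, key=scores.get)
        if scores[winner] >= best:
            break  # no remaining covariate improves BIC
        selected.append(winner)
        remaining.remove(winner)
        best = scores[winner]
        path.append((tuple(selected), best))
    return selected, path

# Toy stand-in for model fitting: each covariate lowers BIC by a fixed
# (made-up) amount and costs a fixed (made-up) penalty of 5
gains = {"eduyrs": 40.0, "age": 20.0, "noise": 2.0}

def toy_bic(covs):
    return 1000.0 - sum(gains[c] for c in covs) + 5.0 * len(covs)

selected, path = forward_stepwise(["age", "noise", "eduyrs"], toy_bic)
```

In practice `fit_bic` would refit the LC model with the given covariates in the class weights, which is the expensive part of the search.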

Table 7 Information criteria, log-likelihood value (\(\hat{\ell }\)), and number of parameters (#par) for the LC models with 6 classes, the same measurement model for all country/round, country and round as covariates extended also one-step-at-a-time by the other socio-economic features

In the end, the selected model, denoted by \(M_3\), is the LC model with 6 classes and the same measurement model for all country/round and 6 other covariates affecting the class weights: eduyrs, hincfel, age, ctzcntr, domcil, and wrkac6m. This is our final model, under which we obtained the results commented in the next section.

5.2 Results

Table 8 reports the estimated prior probabilities under the selected model \(M_3\) averaged over all the observed covariate configurations (\(\hat{\bar{\pi }}_u\), \(u=1,\ldots ,k\)), while the estimated conditional probabilities (\(\hat{\phi }_{jy|u}\)) are reported in the Appendix (Table 15).

Table 8 Estimated average prior probabilities under the selected LC with 6 classes, covariates, and multinomial logit link function

The estimated conditional probabilities show that the chance of answering with a high response category (corresponding to a high level of immigration support) generally increases from class 1 to 6, whereas the probability of answering with a low response category (corresponding to a low level of acceptance) generally decreases as the class index increases. In other words, the latent classes are substantially ordered according to the level of immigration acceptance, and this is a great advantage in terms of interpretability. For a more straightforward check, we computed class/item-specific scores, obtained as the weighted average of a set of scores assigned to each response category (1 for the first, 2 for the second, and so on), with weights equal to the conditional response probabilities. The class/item-specific scores, denoted by \(\hat{\bar{\phi }}_{ju}\), are presented in Table 9 and represented in Fig. 1; these scores confirm that we are essentially dealing with ordered latent classes even if we are not relying on a GRM parametrization.
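The class/item-specific score is simply the expected category score under the conditional response probabilities of a given class. A minimal sketch follows; the function name and the probabilities are hypothetical and are not those of Table 15.

```python
def item_score(phi_ju, first_score=1):
    """Weighted average of the category scores 1, 2, ... for one item and class.

    phi_ju: conditional response probabilities of the item categories,
            in increasing order of category, for a given latent class
    """
    return sum((first_score + y) * p for y, p in enumerate(phi_ju))

# Hypothetical conditional probabilities for one item with 4 categories
low_class = [0.4, 0.3, 0.2, 0.1]    # mass concentrated on low categories
high_class = [0.1, 0.2, 0.3, 0.4]   # mass concentrated on high categories

score_low = item_score(low_class)    # 0.4*1 + 0.3*2 + 0.2*3 + 0.1*4
score_high = item_score(high_class)  # 0.1*1 + 0.2*2 + 0.3*3 + 0.4*4
```

If the scores of every item increase with the class index, as in this toy case, the classes can be read as ordered along the latent trait.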

According to the results in Table 8, the largest share of subjects (24.3%) belongs to class 4, which is characterized by an upper-intermediate level of general immigration acceptance and of opinion about the impact of immigration on the host country. This class is also characterized by one of the highest conditional probabilities for the third category (\(y=2\)) and for the sixth category (\(y=5\)). Over 9% of subjects are in class 1 and 13.9% of subjects are in class 6, corresponding to the lowest and highest levels of immigration attitudes, respectively.

Table 9 Class/item-specific scores \(\hat{\bar{\phi }}_{ju}\)
Fig. 1 Estimated score values \(\hat{\bar{\phi }}_{ju}\) for the selected LC model with 6 classes, covariates, and multinomial logit link function

The estimates of the regression coefficients for the covariates included in the multinomial logit parametrization, see assumption (2), are displayed in Table 10, together with the corresponding p-values for the hypothesis that each coefficient is equal to 0. Clearly, most of the considered covariates are significant at the 5% level (marked as \(^{**}\)).

Table 10 Estimated covariate coefficients (\(\hat{\beta }_{u}\), \(u=2,\ldots ,k\)) under the selected LC model with 6 classes, covariates, and multinomial logit link function
Table 11 Estimates of the individual weights \(\pi _u({{\varvec{x}}}_i)\), \(u=1,\ldots ,6\), for different values of the covariates, for the selected LC model with \(k=6\) classes
Table 12 Estimates of the cumulative individual weights \(\pi _u^*({{\varvec{x}}}_i)\), \(u=1,\ldots ,6\), for different values of the covariates, for the selected LC model with \(k=6\) classes

The most interesting estimates concern the effect of time (included through the time dummies round) and country, which may be interpreted considering that the 6 latent classes of individuals are essentially ordered from that with the lowest to that with the highest level of immigration acceptance. Regarding the first aspect, we conclude that European publics are becoming slightly more tolerant, with a significant difference between round 5 and the other rounds. Moreover, as the regression parameters for most of the country covariates are negative in comparison to Germany, Europeans in all the other countries (with the exception of Sweden) tend to be more negative toward immigrants, especially in the Czech Republic, Estonia, and the United Kingdom. These results are in agreement with the conclusions of Heath and Richards (2016), who compared the frequencies for the selected questions asked in 2002 and 2014.

For a clearer interpretation of the results, we calculated the individual prior probabilities \(\pi _u({{\varvec{x}}}_i)\) (Table 11) and the corresponding cumulative prior probabilities (Table 12), which are defined according to (4), for an “average man” with the most common covariate profile (i.e., a German, 50 years old, with 13 completed years of education, with citizenship, living in a town or small city, never unemployed, having had paid work during the last 7 years, but not having a chance to work in another country in the last 10 years, coping on present income).

Then, we considered how the prior probabilities change when each single covariate changes. The prior probabilities at varying levels of one covariate were computed with all the other covariates held at their most frequent category. Consequently, the cumulative probability corresponds to the chance that a respondent characterized by the considered covariate value belongs to a latent class up to class u (see Eq. 4). These results are reported in Tables 11 and 12 and also represented in Figs. 2, 3, 4, 5, 6 and 7.
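The computation behind Tables 11 and 12 can be sketched as follows, under the assumption that the multinomial logit parametrization takes the first class as reference (consistently with coefficients being reported for \(u=2,\ldots ,k\)) and that the cumulative probability for class u is the sum of the prior probabilities of classes 1 to u. The function names and the coefficient values are hypothetical and are not those of Table 10.

```python
import math

def class_weights(x, intercepts, slopes):
    """Multinomial logit class weights with class 1 as reference.

    x:          covariate vector of one individual
    intercepts: intercepts for classes 2..k
    slopes:     one coefficient vector per class 2..k
    """
    eta = [0.0] + [b0 + sum(b * xi for b, xi in zip(b1, x))
                   for b0, b1 in zip(intercepts, slopes)]
    m = max(eta)  # subtract the maximum for numerical stability
    expo = [math.exp(e - m) for e in eta]
    total = sum(expo)
    return [e / total for e in expo]

def cumulative(weights):
    """Probability of belonging to a class up to class u, for u = 1..k."""
    out, running = [], 0.0
    for p in weights:
        running += p
        out.append(running)
    return out

# Hypothetical model with k = 3 classes and a single covariate
intercepts = [0.0, 0.0]
slopes = [[math.log(2.0)], [math.log(4.0)]]

pi_x0 = class_weights([0.0], intercepts, slopes)  # covariate at 0
pi_x1 = class_weights([1.0], intercepts, slopes)  # covariate at 1
cum_x0 = cumulative(pi_x0)
cum_x1 = cumulative(pi_x1)
```

With positive coefficients, as in this toy case, raising the covariate lowers the cumulative probabilities of the lower classes, which is how the curves in the figures are read.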

Fig. 2 Estimated cumulative prior probabilities according to the round under the selected LC model with 6 classes, covariates, and multinomial logit link function

Fig. 3 Estimated cumulative prior probabilities according to the country under the selected LC model with 6 classes, covariates, and multinomial logit link function

Fig. 4 Estimated cumulative prior probabilities according to the place of living under the selected LC model with 6 classes, covariates, and multinomial logit link function

Fig. 5 Estimated cumulative prior probabilities according to the income perception under the selected LC model with 6 classes, covariates, and multinomial logit link function

Fig. 6 Estimated cumulative prior probabilities according to the number of completed years of education under the selected LC model with 6 classes, covariates, and multinomial logit link function

Fig. 7 Estimated cumulative prior probabilities according to age under the selected LC model with 6 classes, covariates, and multinomial logit link function

Figure 2 shows a slight decrease in the cumulative probability for the second and third classes (characterized by low and intermediate immigration acceptance) with respect to the fifth round of the survey.

Figure 3 confirms that the highest probability (close to 60%) of belonging to the first two classes is found for the Czech Republic, followed by the United Kingdom, Estonia, Slovenia, and France. As far as the classes with upper-intermediate and high immigration acceptance are concerned (i.e., latent classes 4 to 6), the Czech Republic is the country with the lowest chance of belonging to those groups, followed by Estonia, Slovenia, and the United Kingdom (confirmed also by the lowest prior probabilities in Table 11 for those classes). In contrast to the Czech Republic and Estonia, the lowest probability of belonging to one of the first three classes and the highest increase in cumulative probability between the last two classes (characterized by very high and the highest level of immigration acceptance) are observed for Sweden and Germany. Regarding the third class, the highest prior probability is observed for Finland, followed by Poland. Moreover, in most countries respondents are prone to belong to class 4, characterized by the upper-intermediate level of immigration acceptance, with a prior probability above 0.3, especially for the Netherlands, Sweden, Belgium, and France (see Table 11).

Concerning the other socio-economic features considered in our analysis, we observe positive regression parameters for the education (eduyrs), income level perception (hincfel), place of living (BC, SBC), and ctzcntr (with the exception of the first class) covariates (see Table 10). Therefore, as the number of years of education increases, the level of immigration acceptance (i.e., the probability of belonging to classes with a higher immigration acceptance, see Fig. 6) also increases. The probability of belonging to the first two classes is over 0.70 for uneducated respondents, whereas for those with 25 completed years of education the probability of belonging to one of the first 5 classes is equal to 0.65. These results are reasonable and in agreement with previous research (Coenders and Scheepers 2003; Kunovich 2004; Rustenbach 2010; Nagayoshi and Hjerm 2015).

Individuals living in villages (V) or having homes in the countryside (C) tend to be more negative about immigration compared to those living in towns or small cities (see also Davidov and Meuleman 2012; Markaki and Longhi 2012). The results given in Fig. 4 show higher cumulative prior probabilities up to the third class for respondents living in those three areas of European countries, with the highest values for residents of the countryside, as opposed to the respondents living in big cities (BC) or in the suburbs (SBC) of Europe.

Attitudes toward immigration become more positive as the feeling about household’s income improves and as the size of the place of living grows. The probabilities of belonging to the classes with upper-intermediate and high immigration acceptance increase with higher levels of income perception, as also clarified by Fig. 5. Europeans living comfortably on present income are considerably more prone to belong to classes 5 and 6, compared to those finding it difficult or very difficult to live on present income (see also Table 11).

Based on the results presented in Tables 10, 11, and 12, we also conclude that respondents who had paid work in another country for a period longer than 6 months in the last 10 years and people not having citizenship of the country tend to be more supportive of immigration.

Finally, older people seem to be less prone to accept immigrants in their own countries. Figure 7 shows a noticeably lower probability of belonging to the classes with the lowest level of immigration attitudes for the youngest respondents of the survey. Accordingly, the tendency to belong to one of the first three classes (as opposed to the classes with a higher immigration acceptance level) increases with age.

6 Discussion

To evaluate the changing attitudes toward immigration in EU countries, we adopt a latent variable approach for ordinal polytomously-scored items. The approach relies on discrete latent variables and allows for covariates that influence the weights of the latent classes. The approach is applied to the analysis of cross-national ESS data for the period 2010-2018.

The present study makes several contributions to the understanding of attitudes toward immigration in European public opinion in the years with the highest immigration dynamics in Europe:

  • Unlike previous research, we show that the analyzed (heterogeneous) survey data can be explained by 6 latent classes corresponding to homogeneous groups of Europeans with similar levels of immigration acceptance.

  • We present results on the tendency of general immigration acceptance and the impact of immigrants on host countries in the recent years as well. This extension of traditional Item Response Theory models, based on the assumptions of discreteness and also multidimensionality of the latent trait, may be especially useful in socio-economic data analyses where the normality and unidimensional assumptions of the latent trait (explicitly introduced) are very often too restrictive (Bartolucci et al. 2014; Genge 2017).

  • We present the effect of different socio-economic covariates and show that, in the considered period, Europeans are becoming slightly more positive in their attitudes toward migrants, but this tendency can be especially observed in countries such as Germany or Sweden.

  • The adopted models also allow us to study the problem of measurement invariance of the items across countries and rounds of the longitudinal survey. Testing this assumption enables us to achieve reliable comparability between the results presented for respondents answering questions at different time points and living in different EU countries. Therefore, the presented approach is very useful in the analysis of heterogeneous data that poses methodological challenges for cross-country comparisons.

Europeans are quite heterogeneous in terms of public support for (or opposition to) immigration, in their extent of internal unity, and in the drivers associated with anti-immigration sentiment. The characteristics of the 6 latent classes corresponding to homogeneous groups of citizens may help to formulate more precise political ideas and confrontations addressed to identified groups of Europeans with similar levels of immigration acceptance. Moreover, providing information on attitudes toward immigrants in different countries across time represents a powerful means for policies designed to decrease the distances between members of the host society and to promote intergroup contacts.