1 Introduction

Since 2000, the number of international migrants has grown by more than 100 million, with Europe representing the destination area of the largest number of international migrants in the world (87 million in 2020; see United Nations Department of Economic (2020)). The intensification of immigrant flows has been a central issue in public debates over the last two decades, shaping people’s attitudes toward immigration. The last wave of the Eurobarometer (Footnote 1) highlights that immigration is among the top concerns of people in all Member States (Standard Eurobarometer 2022). Recent literature has stressed the role of immigration in the populist and demagogic debate that adopts the strategies of negative other-representation and criminalisation of immigrants: among others, see Greco and Polli (2019) and Combei and Giannetti (2020). In particular, the link between immigration and criminality inflames the political debate in Europe (Solivetti 2018), even though Europeans’ attitudes towards immigration are quite heterogeneous among countries (see Drazanova et al. (2020)). The association between immigration and crime has also long been studied by researchers, with inconclusive results. Social phenomena related to immigration are complex and change depending on the time and place considered. Thus, the association with crime is far from trivial, as the results could be influenced by many unobserved factors (Taft 1933; Martinez 2000; Kubrin 2023).

This paper focuses on a specific aspect of such a big picture, i.e., immigrants in prison. We question whether different attitudes of countries’ policies toward migrant integration correspond to different propensities to hold foreigners in prison. Moreover, we aim to introduce the FNCH model on clusters as a new quantitative tool to investigate demographic and social research issues.

The aim of this work is not the assessment of a causal relationship between migrants’ integration and foreigners’ propensity to be incarcerated. One may estimate the causal effect of a specific policy targeting integration; however, it is impossible to measure the effect of the level of integration as a whole. It is not a limitation of the method but rather an intrinsic characteristic of causality and causal inference. Indeed, the level of integration is not a single intervention received by a target at a particular point in time and space, but it can be seen as a feature of a country (that may vary in time). There are diverse variables that may play a role in the determination of the incarceration rate of foreigners in different countries, going from countries’ levels of punitiveness to the difference in the composition of the migrant populations. In fact, it is well-known that different countries have different levels of punitiveness and incarceration policies (see, among others, Solivetti (2010)). Furthermore, foreigners’ incarceration rate may depend both on their propensity to commit crimes and on countries’ incarceration policies; in turn, the propensity to commit crimes may be influenced, among other things, by one’s culture, level of inclusion, legal status, and socioeconomic characteristics, all of which are affected by integration policies. Figure 1 sketches some possible causal relations.

Fig. 1 Graph representing the plausible relations among phenomena connected to immigration, integration, and incarceration. In this work, we investigate the existence of the association between integration policies and immigrants’ incarceration (in squares). However, we do not speculate on the existence of a causal relation; we expect the relationship between the two to be mediated and influenced by several other variables.

Our objective is to describe the phenomenon of foreigners in prison through the lens of integration policies without suggesting a direct causal relation. We investigate the problem by focusing on European countries. Given the subjects’ heterogeneity and the complexity of such a multidimensional phenomenon, we propose a clustering approach. Then, we compare the propensity of immigrants to end up in prison among clusters of countries with similar integration policies. We leverage data from multiple sources. To cluster European countries by their level of integration towards migrants, we rely on MIPEX data for the year 2019. The covered policy areas of integration are:

  • Labour Market Mobility

  • Family Reunion

  • Education

  • Political Participation

  • Long-term Residence

  • Access to Nationality

  • Antidiscrimination

  • Health

A more detailed description of MIPEX dimensions is given in Sect. 3. For a more extensive one, see Solano and Huddleston (2020) and Alaimo et al. (2021). Concerning the total and foreign population stock and the number of persons held in prison, we rely on Eurostat data and the United Nations data from different agencies, namely UNDESA (UN Department of Economic and Social Affairs) and the UNODC (UN Office on Drugs and Crime), respectively.

Our contribution is twofold. On the one hand, from a methodological perspective, we introduce a new quantitative tool to investigate demographic and social research issues, namely the use of Fisher’s noncentral hypergeometric (FNCH) model on clustered data. In particular, we rely on a model-based clustering approach, similar to Alaimo et al. (2022), to model the MIPEX dimensions, which offers the advantage of clearly stating the assumptions behind the clustering algorithm. It also allows the analysis to benefit from the inferential framework of statistics to address some of the practical questions arising when performing clustering, i.e., determining the number of clusters, detecting and treating outliers, and assessing uncertainty about which component each unit belongs to (Bouveyron et al. 2019). Then, we exploit FNCH to model the number of foreigners held in prison. Leveraging this distribution, we account for the possibility that different clusters have different propensities to hold foreigners in prison. The use of the FNCH model on clustered data is effective when the sample size is small and only count data are available.

On the other hand, we provide interesting findings that contribute to the literature on immigrants’ integration and detention. We find that the cluster with the lowest mean values of the dimensions capturing policies’ integration is also the cluster where the exposure of foreigners to imprisonment is the highest. Moreover, in all clusters, the propensity to hold foreigners in prison is also higher than that of citizens, suggesting that differences among clusters may be due to different overall political attitudes rather than different integration policies.

The article is organised as follows. Section 2 gives a brief background on the related literature and works. Section 3 describes data and their sources in detail. In Sect. 4, the methods used, namely model-based clustering via multivariate Gaussian mixtures (4.1) and Fisher’s noncentral hypergeometric distribution (4.2), are briefly described. The analysis and the results are described and discussed in Sect. 5. The Conclusions, including the limitations of the study and future developments, follow. Supplementary material can be found in Appendix A; in particular, there we introduce a descriptive analysis of the data.

2 Literature background and related works

Countries’ political attitudes shape policies to regulate immigrants’ integration (Joppke 2007), which in turn affect immigrants’ standard of living, their level of inclusion, and their ability to remain in the destination country (Helbling et al. 2020; Solano and Huddleston 2020; Solano and De Coninck 2022). The association between immigration, politics, integration and criminality is a long-studied issue among researchers. However, the results and their interpretations are mixed. In fact, social phenomena related to immigration are complex and change with the time and place considered. Thus, the association with crime is far from trivial, as many other unobserved factors could influence the results. To conceptualise this association, as assessed in Solivetti (2018), some authors have adopted Merton’s thesis within the “anomie” conceptual framework (Merton (1938), and subsequent versions and revisions) that high social pressure to succeed materially in the face of scarce legitimate opportunities leads to crime and other forms of deviance. Other authors have supported the so-called economic model of crime, which, following Becker’s study (Becker 1968), assumes that crime is a rational option whenever its benefit outweighs its cost. Crime costs and benefits, in turn, are influenced by economic conditions, which affect both legitimate opportunities and returns to crime.

According to Bianchi et al. (2012), from a theoretical viewpoint, there are several reasons to expect a significant relationship between immigration and crime. Becker (1968) and Ehrlich (1973) argue that this may happen because immigrants and natives face different legitimate earning opportunities, different probabilities of being convicted and different conviction costs. Also, immigration may affect crime rates as a result of natives’ response to the inflows of immigrants (Borjas et al. 2010). In Europe, Boateng et al. (2021) analyse aggregate-level data obtained from 21 European countries to assess the effects of immigrants on three different types of violent crimes. Their results indicate a null relationship between immigration and crime, suggesting that immigration is unrelated to all three types of crimes assessed.

Throughout this paper, we focus the analysis on foreigners held in prison. A first descriptive analysis of the differences in incarceration rates for natives and foreigners by political attitude was carried out by Jackson and Parkes (2008). The authors found that the “impact” on incarceration rates of corporatist and neoliberal economic and political regimes in Germany, France and Britain between 1970 and 2003 is not the same for native workers, or “insiders”, and “outsiders”, namely, foreign workers and asylum seekers. Another investigation of the relationship between political party vote and attitude toward immigrants was carried out by Indelicato et al. (2023), who used the 2013 International Social Survey Programme (ISSP) dataset of six countries (namely Belgium, Germany, Spain, France, the UK, and Portugal). The authors leveraged hybrid fuzzy TOPSIS (Technique for Order of Preference by Similarity to the Ideal Solution). They found that left-party voters are more open toward immigrants than right-party voters and that green-party voters show the most positive attitudes toward immigrants. The link between nationalism, political patriotism, and anti-immigrant attitudes has been investigated in Grigoryan and Ponizovskiy (2018), focusing on Russia as a case where anti-immigrant sentiments are widespread across all social strata. The authors conducted multiple group confirmatory factor analysis (MGCFA) on the ISSP National Identity module in 1995, 2003, and 2013. Their findings support the theoretical distinction that nationalism is linked to anti-immigrant attitudes, political patriotism is linked to more positive attitudes, and cultural patriotism is largely unrelated to attitudes toward immigrants.

Lynch and Simon (1999) examine the relationship between immigration policy and criminal involvement of immigrants in Australia, Canada, France, Germany, Great Britain, Japan, and the United States. Their findings show a general (but not perfect) pattern in which nations with a higher number of immigrants and an attitude more favourable toward migrant integration have lower ratios of immigrant-to-native crime than nations with less liberal policies. Koopmans (2010) found opposite results; the author investigates how integration policies and welfare-state regimes have affected the socioeconomic integration of immigrants, focusing on Germany, France, the United Kingdom, the Netherlands, Switzerland, Sweden, Austria and Belgium. The results suggest that multicultural policies, which grant immigrants easy access to equal rights and do not provide strong incentives for host-country language acquisition and inter-ethnic contacts, when combined with a generous welfare state, have produced low levels of labour market participation, high levels of segregation and a strong over-representation of immigrants among those convicted for criminal behaviour.

3 Data sources

3.1 Migrant integration

In this paper, we adopt the definition of migrant integration given in Solano and De Coninck (2022), which also informs the development of the MIPEX: migrant integration refers to the process of settlement, interactions with the receiving society and social change due to immigration (Penninx 2019; Garcés-Mascareñas and Penninx 2016; Entzinger 2000). MIPEX is a comprehensive and robust tool for assessing and benchmarking governmental efforts to facilitate migrant integration across countries. It plays a pivotal role in guiding key policy actors in leveraging indicators to enhance integration governance and policy effectiveness. Specifically, MIPEX evaluates policies aimed at fostering socioeconomic and civic integration. As described in Solano and Huddleston (2020) and Alaimo et al. (2021), the MIPEX is a system of 167 policy indicators. It covers 52 countries and collects data from 2007 to 2019 to provide a view of integration policies across a wide range of differing environments. Experts from each country evaluate the 167 basic indicators, which are first aggregated into 58 indicators and then into 8 policy areas via arithmetic mean (Footnote 2). Each dimensional synthetic indicator is bounded between 0 and 100: the higher the value, the better the situation in that policy area. Although not without its critics, this index is widely acknowledged in scholarly discourse as a fundamental reference for comparative analyses of migrant integration (Hadjar and Backes 2013; Ruedin 2015; Rayp et al. 2017; Ingleby et al. 2019). Integration policies relate to the conditions required to become and to remain part of a specific society and the entitlement rights as well as the support migrants receive (Garcés-Mascareñas and Penninx 2016; Entzinger 2000). An overview of many of the existing indexes and indicators on migration policies and their methods is given in Solano and Huddleston (2022). MIPEX data are freely downloadable from the Migrant Integration Policy Index website (Footnote 3). We focus on 2019 data.

A brief description of the policy areas of integration covered is:

  • Labour Market Mobility: evaluates countries’ policies for full equality of rights and opportunity in the labour market for migrants. According to Solano and Huddleston (2020), integration of immigrants into the labour market is a process that happens over time and depends on general policies, context, immigrants’ skills and the reason for migration.

  • Family Reunion: Family reunification policies determine if and when separated families can reunite and settle in their new homes. According to Solano and Huddleston (2020), many Western European countries restrict eligibility for reunification to the nuclear family and expect transnational families to live up to standards that many national families could not: high incomes, no social benefits and the ability to pass language or cultural tests.

  • Education: evaluates policies to support immigrant pupils in finding the right school or class or in “catching up” with their peers.

  • Political Participation: is one of the weakest areas of integration (Solano and Huddleston 2020). In most countries, foreign citizens are not enfranchised or regularly informed, consulted or involved in local civil society and public life. Despite European norms and promising regional practices, political participation is still almost absent in integration strategies in Bulgaria, Lithuania, Romania, and Slovakia.

  • Long-term Residence: evaluates policies that support the path of migrants for the security of a permanent residence. It may be a fundamental step toward full citizenship and better integration outcomes.

  • Access to Nationality: Facilitating access to nationality can significantly increase naturalisation rates and boost integration outcomes. Nationality policies are a major weakness in most European countries (Solano and Huddleston 2020), especially Austria, Bulgaria, the Baltics and Eastern Europe.

  • Antidiscrimination: evaluates antidiscrimination laws to inform and support victims to take the first step in the long path to justice.

  • Health: The inclusion of migrants into the health system of destination countries is coming to be seen as an essential component of their integration.

For brevity, throughout the work we name the indicators using one-word labels, following the notation of Alaimo et al. (2021).

3.2 Foreigners in prison

The United Nations Office on Drugs and Crime (UNODC) collects data on access and functioning of justice, including persons held in prison, on a yearly basis (UNODC 2020). Member States usually submit national data to UNODC through the United Nations Survey of Crime Trends and Operations of Criminal Justice Systems (UN-CTS). UNODC labels as “persons held in prison” all persons held in prisons, penal institutions or correctional institutions; non-criminal prisoners held for administrative purposes should be excluded, for example, persons held pending investigation into their immigration status or foreign citizens without a legal right to stay (UNODC 2020). Data can be detailed by citizenship (citizens/foreigners). For the sake of coherence with foreigners’ stocks (see Sect. 3.3), we refer to 2020 data. For a few countries (Footnote 4), UNODC data for 2020 are not available. In such cases, we rely on data collected for the World Prison Brief (WPB) by the International Centre of Prison Studies in London (WPB 2020), which have proven to be consistent with the UNODC data for those countries whose data are available in both sources; UNODC itself includes the WPB among its sources.

Note that the definition of foreigners in prison includes all people who are not citizens of the country of residence; thus, such a definition does not perfectly overlap with the definition of “immigrant”, namely “a person establishing his or her usual residence in the territory of a Member State for a period that is or is expected to be of at least 12 months, having previously been usually resident in another Member State or a third country” (Regulation (EC) No 862/2007 on Migration and international protection). “Foreigners in prison” may include travellers and foreigners who are not country residents; we assume such overcoverage errors to be negligible. Moreover, in countries where citizenship is based on ius sanguinis, “foreigners in prison” may also include people born in the country of residence, even if they have never lived abroad. On the other hand, the term “foreigners” does not include people who return to their country of citizenship after living in another country.

Finally, we would like to stress again that our target population is “foreigners in prison” or incarcerated foreigners, which differs from “criminal foreigners”. Indeed, the latter would also include foreigners who have committed crimes that do not lead to imprisonment, foreigners who have not been incarcerated yet, and foreigners who have committed crimes that have not been reported. On the other hand, “foreigners in prison” may include unsentenced detainees who might not be guilty.

3.3 Foreigners in a country

Data about the size of a population and its composition by citizenship are made available yearly by Eurostat (2024). We refer to the population on January 1, 2020. For a few countries (Footnote 5), Eurostat data for 2020 are not available. In such cases, we rely on data collected by the United Nations Department of Economic and Social Affairs, Population Division (UNDESA) (2020). While the total population sizes from the two sources coincide, the “foreigners” populations differ. Indeed, Eurostat gives information about foreign citizens. In contrast, UNDESA estimates, based on official statistics, are mainly obtained from population censuses on the foreign-born or foreign population (Footnote 6). As pointed out in the previous section, the definitions of foreigners and foreign-born are different.

4 Model setting

As stated in the Introduction, we aim to assess whether there is a significant association between migrants’ integration, measured using the MIPEX, and the proportion of foreigners in prison. The most natural way to measure such an association is regression analysis: one may take the proportion of foreigners in prison as the dependent variable and regress it on the MIPEX dimensions. This approach may not be appropriate for at least two reasons. First, our dataset includes 34 observations, i.e., 34 countries, and 8 regressors, namely the MIPEX dimensions. The sample size is too limited for so many covariates, and the degrees of freedom are insufficient to obtain reliable estimates. Second, there would be interpretability issues. The coefficients of a multiple (linear) regression must be interpreted ceteris paribus, i.e., keeping everything else constant; each indicates the variation of the dependent variable for a unit change in the corresponding covariate. In our case, this would mean estimating how the proportion of foreigners in prison changes for a unit change in one of the MIPEX dimensions while keeping all the others constant. Given the small number of observations, this would lead to an interpolation of questionable reliability.

For these reasons, we need other, more sophisticated yet simple statistical methods. To this aim, we first cluster countries according to MIPEX dimensions; then, we estimate the relative exposure to imprisonment relying on Fisher’s noncentral hypergeometric (FNCH) model, which arises naturally in this context. The following subsections briefly describe the methods.

4.1 Clustering via finite mixture of multivariate normal

Mixture models have an interesting origin story, dating back to the late 19th century. They were first thought of by two notable figures: the biometrician, statistician and eugenicist Karl Pearson and the evolutionary biologist Walter Weldon. In 1893, Weldon observed something unusual in his study of shore crabs: the proportions of their forehead to body lengths varied in a way that didn’t fit the usual patterns. He thought this could be evidence of the crabs evolving into different groups. Pearson took this idea and developed a new mathematical method to analyse such data, laying the groundwork for what we now call mixture models. In simple terms, mixture models are a useful tool that allows us to model data more accurately by combining different statistical distributions. Finite mixture models can describe populations in which the assumption of a finite number of subpopulations holds. Consequently, finite mixtures offer suitable models for cluster analysis by assuming that each group of observations in a given dataset suspected to contain clusters originates from a population characterised by a distinct probability distribution. Although these distributions may belong to the same family, they differ in the values of their distribution parameters. So, by using finite mixture densities as models for cluster analysis, the clustering problem becomes that of estimating the parameters of the assumed mixture and then using the estimated parameters to calculate the posterior probabilities of cluster membership (Everitt et al. 2011). Model-based clustering, specifically using finite mixtures of multivariate normal distributions, has gained considerable popularity due to its ability to approximate the density function of any unknown distribution (Titterington et al. 1985; Li and Barron 1999). 
We have already mentioned in the Introduction that, unlike classical non-parametric distance-based clustering methods, model-based clustering offers the additional advantage of leveraging the inferential framework of statistics to address various practical questions that arise during clustering. These include determining the number of clusters, detecting and handling outliers, and assessing uncertainty regarding the assignment of units to specific components (Bouveyron et al. 2019).

Let \({\varvec{x}} = \{{\varvec{x}}_1, {\varvec{x}}_2, \dots , {\varvec{x}}_N\}\) be a set of N independent and identically distributed multivariate observations, each of dimension J, so that \({\varvec{x}}_i=\{x_{i1}, x_{i2}, \dots , x_{iJ}\}\), with \(i=1,\dots ,N\). A finite mixture model represents the probability distribution or density function of one multivariate observation, \({\varvec{x}}_i\), as a finite mixture or weighted average of G probability density functions, called mixture components (Bouveyron et al. 2019):

$$\begin{aligned} f({\varvec{x}}_i) = \sum _{g=1}^{G} \tau _g f_g ({\varvec{x}}_i\mid {\varvec{{\varvec{\theta }}}}_g) \end{aligned}$$
(1)

where \(\tau _g\) is the probability that an observation was generated by the g-th component, under the constraints that \(\tau _g\ge 0\) for \(g=1,\dots , G\), and \(\sum _{g=1}^{G} \tau _g =1\), while \(f_g (\cdot \mid {\varvec{\theta }}_g)\) is the density of the g-th component given the values of its parameters \({\varvec{\theta }}_g\).

We consider the case in which each component arises from a multivariate normal distribution \( {\varvec{x}}_i\mid {\varvec{{\varvec{\theta }}}}_g\sim f_g({\varvec{x}}_i\mid {\varvec{{\varvec{\theta }}}}_g) = MVN({\varvec{\mu }}_g, {\varvec{\Sigma }}_g)\), and has the form:

$$\begin{aligned} f_g({\varvec{x}}_i\mid {\varvec{\mu }}_{g},{\varvec{\Sigma }}_{g})= \frac{1}{\mid 2\pi {\varvec{\Sigma }}_{g}\mid ^{1/2}}\exp \left\{ -\frac{1}{2}({\varvec{x}}_i -{\varvec{\mu }}_{g})'{\varvec{\Sigma }}_{g}^{-1}({\varvec{x}}_i-{\varvec{\mu }}_{g})\right\} \end{aligned}$$
(2)

where \({\varvec{\mu }}_{g}\) is the mean vector and \({\varvec{\Sigma }}_{g}\) the covariance matrix of the g-th component. Model parameters are estimated using the iterative Expectation-Maximisation (EM) algorithm (Dempster et al. 1977).

Clustering via finite mixtures of multivariate normal distributions (McLachlan and Basford 1988; Fraley and Raftery 2002, 2007; Melnykov and Maitra 2010) allows classifying each observation \({\varvec{x}}_i\), with \(i=1,\dots , N\), into one of the G groups by computing the posterior probabilities of component membership. In the multivariate setting, the volume, shape, and orientation of the covariance matrices can be constrained to be equal or variable across groups. This yields a family of parsimonious models, with 14 possible parametrisations having different geometric characteristics (Scrucca et al. 2016). The optimal model and the optimal number of clusters are chosen according to the Bayesian information criterion (BIC).
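The classification step can be illustrated with a minimal sketch, assuming the mixture parameters are already known rather than estimated by EM. It evaluates the component density of Eq. (2) for a diagonal covariance matrix, forms the mixture density of Eq. (1), and assigns an observation to the component with the highest posterior probability \(\tau _g f_g({\varvec{x}}_i) / \sum _{h} \tau _h f_h({\varvec{x}}_i)\). The function names and all numerical values are hypothetical and chosen only for illustration; they are not estimates from the MIPEX data.

```python
import math


def diag_normal_pdf(x, mean, var):
    """Multivariate normal density (Eq. 2) for a diagonal covariance diag(var)."""
    log_pdf = 0.0
    for xj, mj, vj in zip(x, mean, var):
        log_pdf += -0.5 * math.log(2.0 * math.pi * vj) - 0.5 * (xj - mj) ** 2 / vj
    return math.exp(log_pdf)


def posterior_membership(x, weights, means, variances):
    """Posterior probability that x was generated by each mixture component."""
    joint = [t * diag_normal_pdf(x, m, v)
             for t, m, v in zip(weights, means, variances)]
    total = sum(joint)  # mixture density f(x) of Eq. (1)
    return [p / total for p in joint]


# Hypothetical two-component mixture in J = 2 dimensions (illustrative values).
weights = [0.4, 0.6]
means = [[30.0, 40.0], [70.0, 60.0]]
variances = [[100.0, 100.0], [100.0, 100.0]]

z = posterior_membership([35.0, 42.0], weights, means, variances)
label = max(range(len(z)), key=z.__getitem__)  # maximum a posteriori assignment
```

The posterior vector `z` also quantifies the uncertainty of the assignment: values close to 1 for one component indicate a clear-cut classification.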

Classification via finite mixtures of multivariate normal distributions is sufficiently flexible to encompass many current clustering algorithms, including those based on the sum of squares criterion (Banfield and Raftery 1993). A limitation is that it is restricted to Gaussian distributions and does not allow for noise; however, this limitation does not affect our data.

4.2 Fisher’s noncentral hypergeometric distribution

Professor R. A. Fisher first described Fisher’s noncentral hypergeometric distribution (FNCH) in 1935, in “The Logic of Inductive Inference”, published in the Journal of the Royal Statistical Society (Fisher 1935), in the context of contingency tables tests.

FNCH describes a biased urn model where the balls are drawn independently, without replacement. Imagine an urn full of blue and green balls. The urn model is said to be biased if the probability of drawing a blue ball depends not only on the proportion of blue and green balls in the urn but also on the relative weight of the colours.

Assume the balls to be people charged with a crime. These people come from two ethnic groups: the “blues” and the “greens.” There is no reason to suspect that one group is more inclined to commit crimes than the other. At the end of the (independent) trials, if the judgements are fair (unbiased), the proportion of convicted blues must be similar to the proportion of convicted greens. Yet, if the proportion of convicted blues is significantly higher than that of the greens, one may suspect that the judgements are biased in favour of the greens.

More formally, denote with \({\varvec{M}} = (M_B, M_G)\) the number of blue and green balls in the urn, respectively, or the number of blues and greens charged with a crime. Denote with \({\varvec{Y}} = (Y_B, Y_G)\) the number of blue and green balls drawn or the number of blues and greens convicted. Each blue charged will be convicted with probability \(\pi _B\); similarly, \(\pi _G\) is the probability for a green to be convicted.

Hence, we can model the count \(Y_i\), \(i = B, G\), using the Binomial model, i.e.,

$$\begin{aligned} Y_i \sim \text {Binom}(M_i, \pi _i), \quad \forall i \end{aligned}$$
(3)

According to Harkness (1965), FNCH can be seen as the conditional distribution of such independent Binomial distributions given their sum \(Y_B + Y_G = n\); in other words, the composition of convicted people with respect to their ethnicity, given that n people in total have been convicted, follows an FNCH:

$$\begin{aligned} {\varvec{Y}} \mid Y_B + Y_G = n \sim \text {FNCH}({\varvec{M}},n,w) \end{aligned}$$
(4)

where

$$\begin{aligned} w = \dfrac{\pi _B/(1-\pi _B)}{\pi _{G}/(1-\pi _{G})}, \end{aligned}$$
(5)

is the odds ratio, i.e., the weight of blues with respect to the greens; in other words, w represents how much more likely the blues are to be convicted than the greens. If blues and greens have the same weight, i.e., if \(\pi _B = \pi _G\), then \(w = 1\). If \(w > 1\), there would be evidence of a higher propensity to convict blues with respect to greens; \(w < 1\) would mean that greens are more likely to be convicted than blues. Hence, the FNCH model can be useful for testing the hypothesis “\(w = 1\)” against the alternative that the categories have different weights.
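The conditioning argument in Eqs. (3) to (5) can be checked numerically. The sketch below, using only the Python standard library, computes the probability of \(Y_B\) given \(Y_B + Y_G = n\) in two ways: by conditioning two independent Binomials on their sum, and from the FNCH kernel \(\binom{M_B}{y}\binom{M_G}{n-y}w^{y}\) normalised over the feasible values of y. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
from math import comb


def binom_pmf(y, m, p):
    """Binomial probability of y successes out of m trials (Eq. 3)."""
    return comb(m, y) * p**y * (1 - p) ** (m - y)


def support_bounds(m_b, m_g, n):
    """Smallest and largest feasible number of blues among n convictions."""
    return max(0, n - m_g), min(n, m_b)


def fnch_from_binomials(y_b, m_b, m_g, n, pi_b, pi_g):
    """P(Y_B = y_b | Y_B + Y_G = n) obtained by conditioning two Binomials."""
    lo, hi = support_bounds(m_b, m_g, n)
    if not lo <= y_b <= hi:
        return 0.0

    def joint(k):
        return binom_pmf(k, m_b, pi_b) * binom_pmf(n - k, m_g, pi_g)

    return joint(y_b) / sum(joint(k) for k in range(lo, hi + 1))


def fnch_pmf(y_b, m_b, m_g, n, w):
    """The same probability written in terms of the odds ratio w only (Eq. 4)."""
    lo, hi = support_bounds(m_b, m_g, n)
    if not lo <= y_b <= hi:
        return 0.0

    def kernel(k):
        return comb(m_b, k) * comb(m_g, n - k) * w**k

    return kernel(y_b) / sum(kernel(k) for k in range(lo, hi + 1))


def odds_ratio(pi_b, pi_g):
    """Odds ratio of Eq. (5)."""
    return (pi_b / (1 - pi_b)) / (pi_g / (1 - pi_g))


# Hypothetical example: 10 blues and 15 greens charged, 8 convictions in total.
m_b, m_g, n = 10, 15, 8
pi_b, pi_g = 0.5, 0.25
w = odds_ratio(pi_b, pi_g)
p_conditional = fnch_from_binomials(3, m_b, m_g, n, pi_b, pi_g)
p_direct = fnch_pmf(3, m_b, m_g, n, w)
```

The two probabilities agree up to floating-point error, and setting `w = 1` recovers the classical (central) hypergeometric distribution. The conditional distribution depends on \(\pi _B\) and \(\pi _G\) only through w.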

In the multivariate case, namely, when there are more than two colours in the urn, the notation can be extended easily. For instance, in the case of three categories, \({\varvec{Y}} = (Y_1, Y_2, Y_3)\), \({\varvec{M}} = (M_1, M_2, M_3)\). In this case, we need to define \(3-1\) relative weights and choose a category as a reference: e.g., if we choose category 1 as a reference, we need to define the weight of colour 2 with respect to colour 1 (\(w_2\)), and the weight of colour 3 with respect to colour 1 (\(w_3\)). Hence, \({\varvec{w}} = (1, w_2, w_3)\), where \(w_i, i = 2, 3\) is

$$\begin{aligned} w_i = \dfrac{{\pi _i}/{(1-\pi _i)}}{{\pi _1}/{(1-\pi _1)}}. \end{aligned}$$
(6)

FNCH has been underused in the statistical literature mainly because of the computational burden of its probability mass function; for a generic number of categories N:

$$\begin{aligned} P\left( {\varvec{Y}} = {\varvec{y}}\mid \sum \limits _{i=1}^{N}{Y_i}=n\right) = \dfrac{\prod \limits _{i=1}^{N}{\left( {\begin{array}{c}M_i\\ y_i\end{array}}\right) {w_i}^{y_i}}}{\sum \limits _{{\varvec{z}}\in {\mathcal {Z}}}{\prod \limits _{i=1}^{N}{\left( {\begin{array}{c}M_i\\ z_i\end{array}}\right) {w_i}^{z_i}}}} \end{aligned}$$
(7)

where \({\mathcal {Z}} = \left\{ {\varvec{y}} \in {{\mathbb {N}}_0}^N: \bigg [\sum \limits _{i=1}^{N}{y_i} = n \bigg ] ~ \cap ~ \bigg [0 \le y_i \le M_i\bigg ], \forall i \right\} \).
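To make the computational burden concrete, the following minimal sketch evaluates Eq. (7) by brute force: it enumerates the whole set \({\mathcal {Z}}\) and sums the kernel over it. The function names and the toy sizes and weights are hypothetical; the approach is viable only for small problems, because the size of \({\mathcal {Z}}\) grows combinatorially with n and with the number of categories.

```python
from math import comb, prod


def support(sizes, n):
    """Enumerate every y with sum(y) = n and 0 <= y_i <= M_i (the set Z)."""
    if len(sizes) == 1:
        if 0 <= n <= sizes[0]:
            yield (n,)
        return
    for first in range(min(sizes[0], n) + 1):
        for rest in support(sizes[1:], n - first):
            yield (first,) + rest


def kernel(y, sizes, weights):
    """Unnormalised probability: product of binomial coefficients and w_i^y_i."""
    return prod(comb(m, k) * w**k for m, k, w in zip(sizes, y, weights))


def mfnch_pmf(y, sizes, weights):
    """Probability mass function of Eq. (7), normalised by brute force."""
    if any(k < 0 or k > m for k, m in zip(y, sizes)):
        return 0.0
    n = sum(y)
    norm = sum(kernel(z, sizes, weights) for z in support(sizes, n))
    return kernel(y, sizes, weights) / norm


# Hypothetical example with three categories; category 1 is the reference.
sizes = (5, 4, 3)
weights = (1.0, 2.0, 0.5)
n = 6
total = sum(mfnch_pmf(z, sizes, weights) for z in support(sizes, n))
```

Even in this toy case the normalising sum runs over 18 configurations; with realistic population sizes exact enumeration quickly becomes impractical, which motivates approximate methods.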

The sum in the denominator makes it infeasible to derive an MLE for \(w_i\) in closed form; numerical approximation methods are provided by Fog (2008). From a Bayesian perspective, Ballerini and Liseo (2022) provide Markov Chain Monte Carlo (MCMC) methods to derive the posterior probability in the univariate case, also handling the setting in which both the weight w and the size \({\varvec{M}}\) parameters are unknown, in the presence of informative prior information.
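In the univariate case, the FNCH distribution is an exponential family in \(\log w\), so the conditional MLE of w is the value at which the mean of the distribution equals the observed count. The following sketch solves that equation by bisection for a small urn; it is not one of the approximation methods of Fog (2008), and the function names and search bracket are assumptions made for illustration.

```python
from math import comb


def fnch_mean(M_b, M_g, n, w):
    """Mean of the univariate FNCH count of blues among n draws."""
    lo, hi = max(0, n - M_g), min(n, M_b)
    ys = range(lo, hi + 1)
    terms = [comb(M_b, y) * comb(M_g, n - y) * w ** y for y in ys]
    return sum(y * t for y, t in zip(ys, terms)) / sum(terms)


def fnch_mle(y_obs, M_b, M_g, n, tol=1e-10):
    """Conditional MLE of w: the root of fnch_mean(w) = y_obs, by bisection."""
    lo_y, hi_y = max(0, n - M_g), min(n, M_b)
    if not lo_y < y_obs < hi_y:
        raise ValueError("the MLE is finite and positive only for interior y_obs")
    low, high = 1e-8, 1e8  # assumed search bracket for w
    while high / low > 1.0 + tol:
        mid = (low * high) ** 0.5  # bisect on the log scale
        if fnch_mean(M_b, M_g, n, mid) < y_obs:
            low = mid  # the mean increases with w
        else:
            high = mid
    return (low * high) ** 0.5


# Toy example: 7 blues among 10 draws from an urn with 10 blues and 10 greens.
w_hat = fnch_mle(7, 10, 10, 10)
```

Since the urn is balanced and more than half of the draws are blue, the estimate exceeds 1; an observed count of 5 would return an estimate of about 1.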

5 Analysis and results

5.1 Integration clusters

To cluster the MIPEX dimensions data, we rely on a model-based clustering approach built on parameterised finite Gaussian mixture models, chosen for their ability to approximate the density function of any unknown distribution (Titterington et al. 1985; Li and Barron 1999), and fitted via the package mclust (Scrucca et al. 2016) of the R statistical software.Footnote 7 It should be noted that, in a longitudinal setting, MIPEX dimensions have already been clustered in Alaimo et al. (2021), while in a cross-sectional setting, an attempt to cluster the MIPEX data has been made in Seri et al. (2022), and, focusing on European countries, in Hooghe and Reeskens (2009), even if the implemented model is not specified. We compare 14 models with different geometric characteristics of the covariances; each model is fitted for each number of components \(1 \le G \le 9\). Using the BIC, the selected model parametrisation is EVI (diagonal, equal volume, varying shape) with 3 components (Fig. 2).

The cluster means are shown in Table 1.

Table 1 Cluster parametrisation and means
Fig. 2 Countries by cluster membership

  • Cluster 1: Slovakia, Slovenia, Hungary, North Macedonia, and Bulgaria.

    • It is characterised by very low scores in the integration-policy areas of education and political participation; moreover, quite low values characterise the citizenship and labour strands. However, Cluster 1 performs better than the others in the family reunion, long-term residence, and antidiscrimination areas. It should be noted that the security of permanent residence may be a fundamental step on the path to full citizenship and better integration outcomes, according to Solano and Huddleston (2020). Hence, in the case of a significant association between integration and incarceration, the ease of long-term residency would likely represent an important determinant.

  • Cluster 2: Portugal, Spain, France, Belgium, Netherlands, Germany, Switzerland, Ireland, United Kingdom, Iceland, Norway, Sweden, Finland, Denmark.

    • Overall, it is the most virtuous cluster toward migrant integration. It presents higher means than the other 2 clusters in labour market mobility, education, political participation, access to nationality and health, and a lower mean in the family reunion strand, in which only Portugal and Sweden present particularly high values. Countries in this cluster evaluate immigrants’ impact on society most positively (Drazanova et al. 2020).

  • Cluster 3: Estonia, Lithuania, Latvia, Poland, Czechia, Austria, Serbia, Romania, Italy, Malta, Croatia, Albania, Greece, Cyprus, and Turkey.

    • Countries in this cluster are characterised by generally quite low values in all the dimensions. The only particularly high values are those of Austria and Italy in the health area and of Serbia and Romania in the antidiscrimination area.

These results perfectly align with the results in Drazanova et al. (2020) concerning the attitude towards immigration in different European countries. Indeed, countries in Cluster 2 also have the most positive attitude towards immigration in terms of the degree of belief that immigration has an overall positive impact on society. On the contrary, countries in Clusters 1 and 3 have a mostly negative perception.

5.2 Propensity to be held in prison

Let \({\varvec{M}}^c = \left( M^c_1, \dots , M^c_{3}\right) \) be the 3-dimensional vector of counts corresponding to the number of citizens residing in their respective countries in clusters \(g = 1, 2, 3\). Similarly, we denote with \({\varvec{M}}^f = \left( M^f_1, \dots , M^f_{3}\right) \) the vector of counts corresponding to the number of foreigners residing in countries of clusters \(g = 1, 2, 3\). Hence, \(M^\text {c}_g\) and \(M^\text {f}_g\) would be the number of citizens and foreigners, respectively, residing in the \(g^{th}\) cluster; \(M_g = M_g^c + M_g^f\). For each cluster g, let \(Y^c_g\) and \(Y^f_g\) be the numbers of citizens and foreigners, respectively, held in prison.

Therefore, \(Y_g^c\) are the citizens held in prison among all the \(M_g^c\) citizens residing in cluster g. Similarly, \(Y_g^f\) are the foreigners held in prison among all the \(M_g^f\) foreigners residing in cluster g.

In Sect. 5.2.1, we test whether foreigners held in prison have similar weights across clusters; in Sect. 5.2.2, we further test the hypothesis that, within each cluster, foreigners and citizens have the same propensity to be held in prison.

We generally assume:

$$\begin{aligned} Y^h_g \overset{\text {ind}}{\sim } \text {Binom}\left( M^h_g, \pi ^h_g\right) \quad h=\text {c},\text {f} \;, \end{aligned}$$
(8)

where \(\pi ^h_g\) is the probability of being held in prison in cluster g for each status h. We can estimate the clusters’ propensity to be incarcerated as \({\hat{\pi }}^h_g = y^h_g/M^h_g\), with \(g = 1, 2, 3\) being the cluster label. However, we are interested in the relative propensity in the form of odds ratios for two reasons. The first is interpretability. The second is that the odds ratio would not be affected by undercoverage in the population stocks if we assume the undercoverage rates to be constant over the clusters. For instance, consider the possibility that \(M^\text {f}_g\) is undercovered, i.e., that cluster g misses some units when counting the number of foreigners, mainly because of illegal immigration. In such a case, we would observe a portion \(k^\text {f}_g \in (0,1]\) of \(M_g^\text {f}\). If we assume \(k^\text {f}_g = k^f\) for all g, the odds ratios remain constant whatever the level of \(k^f\); that does not hold for the absolute propensity. Such an assumption is plausible; Mastrobuoni and Pinotti (2015) report that, in many developed countries, the incidence of irregular immigrants is approximately \(25\%\) of the total immigrants (see also González-Enríquez (2009)).
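The undercoverage argument can be checked numerically. The sketch below uses hypothetical stocks, prison counts, and a hypothetical common coverage rate; none of these figures come from the data. It compares the sample odds ratio between two clusters and the absolute propensity, with and without undercoverage. In this example the odds ratio is essentially unchanged, with a small discrepancy that shrinks as prison counts become small relative to the stocks, whereas the absolute propensity is inflated by a factor \(1/k^f\).

```python
def sample_odds_ratio(y_a, M_a, y_b, M_b):
    """Sample odds ratio of cluster a relative to cluster b."""
    return (y_a / (M_a - y_a)) / (y_b / (M_b - y_b))


# Hypothetical foreign stocks and prison counts, for illustration only.
M_true = {"a": 1_000_000, "b": 5_000_000}
y = {"a": 2_000, "b": 5_000}
k = 0.75  # hypothetical common coverage rate k^f

# Observed stocks when only a share k of the foreigners is counted.
M_obs = {g: round(k * m) for g, m in M_true.items()}

or_true = sample_odds_ratio(y["a"], M_true["a"], y["b"], M_true["b"])
or_obs = sample_odds_ratio(y["a"], M_obs["a"], y["b"], M_obs["b"])

prop_true = y["a"] / M_true["a"]  # absolute propensity with full coverage
prop_obs = y["a"] / M_obs["a"]    # absolute propensity with undercoverage
```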

Fisher’s noncentral hypergeometric distribution arises naturally in this context. For instance, if we want to test the weights of foreigners held in prison in different clusters, we can write:

$$\begin{aligned} {\varvec{Y}}^f \mid \sum \limits _{g=1}^{3}{Y^f_g=n^f} \sim \text {FNCH}\left( {\varvec{M}}^f,n^f,{\varvec{w}}^f\right) \;, \end{aligned}$$
(9)

where \({\varvec{w}}^f\) is the vector of weights, one of which is set equal to 1; if all elements of \({\varvec{w}}^f\) were estimated to be close to 1, it would mean that there is no significant difference in the propensity to hold foreigners in prison among clusters.

It is important to highlight that utilising clustered odds ratios derived from Fisher’s noncentral hypergeometric model, which captures relative propensities, enables us to address specific challenges that may arise when employing classical regression methods in the presence of a few observations. One notable advantage is that it mitigates estimation issues encountered when dealing with a limited number of units (34 countries) alongside a substantial number of covariates (8).

5.2.1 Between clusters

Cluster analysis has identified three clusters; we aggregate migrant stocks and count data of foreigners in prison accordingly. We provide estimates for the odds ratios \(w_g^\text {f}\)’s in (9) via numerical approximation, using the R package BiasedUrn (Fog 2015). Tables 2, 3 and 4 show the estimates and their 95% confidence intervals, for \(g^*=1\), \(g^*=2\), \(g^*=3\), respectively.

Table 2 Point estimates and confidence intervals of the odds ratios comparing foreigners’ propensity to be incarcerated among clusters. Reference cluster \(g^* = 1\)
Table 3 Point estimates and confidence intervals of the odds ratios comparing foreigners’ propensity to be incarcerated among clusters. Reference cluster \(g^* = 2\)
Table 4 Point estimates and confidence intervals of the odds ratios comparing foreigners’ propensity to be incarcerated among clusters. Reference cluster \(g^* = 3\)

Cluster 2 is the “most virtuous” cluster in terms of migrants’ integration, and it also has the lowest relative propensity to hold foreigners in prison. Compared to it, in Clusters 1 and 3 foreigners are about 1.4 and 2 times more exposed to detention, respectively. We do not aim to claim a causal relationship between the two phenomena; too many confounders, such as (among others) the gender and age structure of the foreign population in the different clusters and differences in the foreigners’ countries of origin, are involved. To investigate further, we look at the different propensities to detain foreigners and citizens within each cluster.

5.2.2 Within clusters

We still consider an FNCH model to test whether foreigners’ propensity to be held in prison differs from that of citizens within each cluster. In such a univariate case, we consider each cluster to be an urn composed of two categories: foreigners and citizens. We assume the number of foreigners held in prison in a cluster, conditionally on the total number of people held in prison in that cluster, is FNCH distributed:

$$\begin{aligned} Y^f_g \mid Y^f_g + Y^c_g = n_g \sim \text {FNCH}\left( \left( M^f_g,M^c_g\right) ,n_g,\phi _g\right) \end{aligned}$$
(10)

where

$$\begin{aligned} \phi _g = \dfrac{\pi ^f_g/\left( 1-\pi ^f_g\right) }{\pi ^c_g/\left( 1-\pi ^c_g\right) } \;; \end{aligned}$$
(11)

a \(\phi _g > 1\) would suggest that foreigners are more exposed to detention than citizens in cluster g.
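For large counts, a rough check of \(\phi _g\) in (11) is the sample odds ratio with a Wald interval on the log scale. This is a textbook large-sample approximation, not the FNCH-based procedure used for Table 5, and the counts below are hypothetical.

```python
from math import exp, log, sqrt


def sample_odds_ratio_ci(y_f, M_f, y_c, M_c, z=1.96):
    """Sample odds ratio of foreigners vs citizens with a Wald interval (log scale)."""
    a, b = y_f, M_f - y_f  # foreigners: in prison / not in prison
    c, d = y_c, M_c - y_c  # citizens: in prison / not in prison
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = log(a * d / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)


# Hypothetical counts for one cluster, for illustration only.
phi_hat, lower, upper = sample_odds_ratio_ci(
    y_f=3_000, M_f=1_000_000, y_c=15_000, M_c=10_000_000
)
```

An interval lying entirely above 1, as with these illustrative counts, would point to a higher exposure of foreigners than of citizens in that cluster.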

Results are shown in Table 5.

In all clusters, foreigners and citizens are differently exposed to being imprisoned; in particular, foreigners are more inclined, or exposed, to be held in prison than citizens. Indeed, a person can be held in prison either after the delivery of the sentence or still unsentenced; foreigners generally suffer more from custodial pretrial measures than the so-called domestic detainees (e.g., for the Italian case, see Gonnella (2015)).

Note that in Cluster 1, the exposure of foreigners relative to citizens is slightly lower than in the other clusters. One possible reason is that Cluster 1 groups transit countries for immigration. Such countries also have strong rejection policies, for which they have even been sanctioned by the European Union.Footnote 8

Table 5 Point estimates and confidence intervals of the odds ratios that compare the propensity to be incarcerated of foreigners and citizens within clusters (reference category: citizens)

5.3 Discussions

By clustering the MIPEX data, we found 3 clusters of countries with similar integration policies. It should be noted that another attempt to cluster MIPEX data was made in Hooghe and Reeskens (2009). The authors analysed a set of 28 European countries for 2006, obtaining 2 clusters. Even though they used another clustering method on fewer countries, for a different year and on disaggregated data, there are several similarities in the countries’ cluster memberships. Leveraging the FNCH model to analyse the differences between clusters, we observe significantly different propensities to detention across clusters. In particular, Cluster 3 has a propensity to hold foreigners in prison about twice that of Cluster 2. However, the within-cluster analyses, which compare the propensities to hold foreigners and citizens in prison, show that foreigners are more likely to be incarcerated than citizens, and that this relative propensity is very similar among clusters. This suggests that the different propensities to detention may be due to different political attitudes, as suggested by Jackson and Parkes (2008) and Indelicato et al. (2023), rather than different integration policies in themselves. Indeed, integration policies mirror the bigger picture of political attitudes in different countries.

6 Conclusions

The interrelated issues of crime and integration have garnered widespread attention from scholars, policymakers, and the general public for several decades. However, due to the intricate nature of these phenomena, comprehensive quantitative analyses examining the relationship between integration policies and crime remain scarce. Our study contributes to addressing this research gap by focusing on a specific sub-dimension of immigrants’ criminality, namely immigrants in prison, and investigates it through the prism of integration policies. To our knowledge, no quantitative analyses have been conducted on the relationship between foreigners in prison and immigrants’ integration policies; in this paper, we make a first attempt to test the existence of a link between the two phenomena in European countries.

We leverage model-based clustering to group European countries according to the evaluation of their integration policies, modelling the eight dimensions of MIPEX via finite mixtures of Gaussian densities. Then, we compare the different exposures to imprisonment among and within clusters, relying on Fisher’s noncentral hypergeometric distribution, which is used to model clusters’ counts of incarcerated foreigners. We find that Cluster 3, the cluster with the lowest mean values of the integration policy dimensions, is also the cluster where foreigners are most exposed to imprisonment, followed by Cluster 1. Moreover, in all clusters, the propensity to hold foreigners in prison is higher than that of citizens, suggesting that differences among clusters may be due to different overall political attitudes rather than different integration policies.

However, one limitation of our work is that, due to the unavailability of data, we cannot consider some potentially interesting structures in the foreign population, such as gender and age. Also, we cannot consider possible covariates of interest in estimating the propensity to be held in prison, such as differences by country of origin. It would also be interesting to picture the differences between countries within each cluster. Another limitation of this work is that it neglects the time dimension. Although the proposed cross-sectional approach helps provide a picture of the two phenomena together, it would be interesting to study the association between integration and foreigners’ incarceration from a panel perspective. For these reasons, explaining the links through which integration policies might impact foreigners’ propensity to be incarcerated goes beyond the scope of our work. However, we aim to add these specifications in future works as the data become available.

Given these findings, several policy recommendations arise. First, countries should strengthen their actions in investigating the association between immigrants’ integration and incarceration. To this aim, enhanced data collection efforts are essential to a better understanding of immigrant incarceration. This includes detailed information on age, gender, country of origin, and other relevant factors, allowing for more tailored policy responses. Supporting longitudinal studies will also help monitor the long-term impact of integration policies, providing deeper insights into their effectiveness over time. Finally, international collaboration and sharing of best practices can help identify and implement effective integration strategies more broadly.