1 Introduction

In the socioeconomic literature, we observe a strong consensus that the well-being concept encompasses multiple dimensions and that looking only at economic aspects may distort perceptions, leading to inadequate policy actions (Atkinson & Bourguignon, 1982).

The 2009 report by the Sen-Stiglitz-Fitoussi Commission on the Measurement of Economic Performance and Social Progress marked a milestone in this debate, requiring researchers across the globe to develop new tools for the multidimensional monitoring of well-being (Stiglitz et al., 2009). Since then, the tools used to measure well-being have flourished in Europe and beyond. The standard of living, quality of life, quality of services and many other aspects of well-being have been measured and monitored through an increasing number of specialized indicators. More recently, climate awareness has created new imperatives for the private and public spheres. Air pollution, water quality, particulate matter and other environmentally related indicators have begun to be assessed throughout Europe, expanding the definition of well-being with an environmental dimension. In many European countries, these elementary indicators have been integrated into national accounts, expanding policymakers’ access to information when designing policies.

Despite these remarkable advances, providing a unique definition of well-being remains a challenge, both on the macro and individual levels. Over the years, scholars have worked to create theoretical frameworks reflecting such multidimensional ideas (see, e.g., Bourguignon and Chakravarty, 2003; Alkire and Foster, 2011). On the macro level, advanced theoretical models are mainly based on a set (or dashboard) of indicators of demonstrated consistency with the well-being construct. Examples include the OECD Better Life Index (BLI) and the Canadian Index of Wellbeing.

In Italy, the first theoretical framework developed in this debate was the “Equitable and Sustainable Well-Being (BES)” jointly proposed in 2013 by the National Council for Economics and Labor (CNEL) and the Italian National Institute of Statistics (ISTAT).

The emergence of a new multi-dimensional well-being paradigm has been revolutionary but not without drawbacks. Comparing nations or sub-regions with multiple and arbitrary sub-dimensions of well-being has become a daunting task (Kasparian & Rolland, 2012). This hurdle gives rise to the need for synthesis. Composite indicators (CIs) fulfil this requirement by reducing complex systems into lower-dimension spaces, thus allowing the performance of an individual unit to be evaluated across space and time.

The state-of-the-art aggregation methods for constructing composite indicators entail a broad list of approaches, from simple ones, such as linear aggregation, to more refined ones. Refined empirical indices are built on non-substitutable and non-compensatory indicators and allow for comparison across territorial units (see, e.g., De Muro et al., 2011; Mazziotta and Pareto, 2013, 2018; Scaccabarozzi et al., 2022).

Although they effectively fulfil their synthesis requirement, most CIs’ approaches require researchers to rely on several structural assumptions, for example, lack of uncertainty measure, normative weighting, temporal stability, spatial independence and linearity in the functional form (Ciommi et al., 2017). The approach we propose in this paper addresses three of them.

First, we argue that the normative selection of indicator weights is problematic. For several CIs, the choice of weights comes from expert judgments or is neutral by setting all indicators equally weighted (Mazziotta & Pareto, 2013). This approach exposes indicator weights to the subjectivity of those involved in constructing the CI.

Second, we question the assumption of the spatial independence of elementary indicators across areas. In current approaches, information about well-being depends exclusively on variables from the area analyzed and not on neighbouring areas. However, economically speaking, the neighbourhood is not random (Fusco et al., 2018), but instead describes a common culture among enterprises, a shared set of administrative rules on the provincial or regional level and so on, creating spatially aligned clusters, not to mention the detrimental influence of neighbouring factors on the validity of model estimates. In the linear regression framework, spatial correlation creates duplicated information and inflates the variance of the statistical model, damaging the validity of the estimated standard errors (Anselin & Griffith, 1988). As suggested by Fusco et al. (2018), spatial composite indicators bring out inherent local differences by identifying spatial clusters when elementary indicators are well clustered. Moreover, the lack of attention to the variables’ spatial dimension may have significant consequences when assigning weights (Sarra & Nissi, 2020).

Third, traditional indices usually lack a measure of uncertainty. This last feature can be problematic if policies or resource allocation decisions are based on cutoff values or index percentiles (Hogan & Tchernis, 2004).

Researchers solve the weights selection problem relying on data-driven statistical models such as principal component analysis (PCA), factor analytic models (Chelli et al., 2015), and Bayesian latent class models (Hogan & Tchernis, 2004; Machado et al., 2009; Ciommi et al., 2020). These weighting methods are helpful when dealing with large data sets to reduce data dimensionality and find common patterns. One critique of this method is that it can accommodate only linear relationships among variables, while it would be reasonable to have non-linear underlying patterns (Canning et al., 2013). Anyhow, when used within the field of well-being CIs, the factor analytic model offers a straightforward interpretation: the elementary indicators are manifestations of an underlying latent construct interpreted as the well-being and factor loadings represent the contribution of each indicator to the well-being construct (Rijpma, 2016; Ciommi et al., 2020).

Here, we follow the above-outlined approach based on factor models. In addition, we assume that well-being spillovers occur among neighboring provinces, creating well-being levels that are spatially correlated. Since we deal with spatial data, we must reformulate the traditional factor analytic model to incorporate spatial co-variation. We follow Hogan and Tchernis (2004) and Davis et al. (2021) and propose a Bayesian latent factor model for spatially correlated multivariate data. Our Bayesian model confers the distinct advantage of estimating a distribution of well-being for each province instead of relying on single-point estimates, thus providing a measure of uncertainty for our estimates. Another advantage of the Bayesian setting is that it can handle missing values with a posterior imputation procedure. In this way, the model directly incorporates the uncertainty caused by missing data into the resulting model’s estimates.

This paper proceeds in three steps, as in Ciommi et al. (2020). We first partition the BES elementary indicators into three distinct well-being domains: social, economic, and environmental. We analyze Italian provinces’ well-being for each dimension through composite indicators, including spatial correlation among provincial well-being levels. Under the assumption that neighbouring areas influence each other, our proposed method allows us to obtain more precise estimates by exploiting information from neighbouring provinces (Hogan & Tchernis, 2004). In the second step, we consolidate the three well-being dimensions into an overall well-being index for each Italian province. Lastly, we estimate the well-being levels of macro-regions (NUTS 1) and evaluate their evolution over time.

The paper is organized as follows. Section 2 describes the data and summarizes the results from the exploratory spatial analysis. Section 3 explains the statistical methodology. Section 4 presents the estimates from implementing statistical models to ‘Province BES’ data. Section 5 is devoted to concluding remarks.

2 Data

Our analysis is based on data from the Province BES dashboard (‘BES at the local level’).Footnote 1The Province BES data contains 55 elementary indicators of well-being grouped into 11 macro-domains for the 107 Italian provinces over the period 2004–2021 (ISTAT, 2021). This data source enables well-being monitoring in the Italian territories over time. The presence of missing values, especially in the early and later years, led us to restrict the analysis to 2012 to 2019. We hold elementary indicators with at least one non-missing value for each remaining year. The final set counts 34 elementary indicators. We list and report descriptive statistics for the selected elementary indicators in Appendix 1.

Our set of indicators resulted in a missing value percentage of 0.7%, which we then impute with a posterior imputation procedure, as explained in Sect. 3.

As in Ciommi et al. (2020), we partition the elementary indicators into three well-being domains: social, economic, and environmental. In doing so, we aim to build composite indicators for each Italian province that summarize the level of well-being in each of these domains.

As mentioned in the introduction, we assume that neighboring provinces have spatially correlated levels of well-being. To test this assumption, we explore the spatial correlation of Province BES indicators through a spatial exploratory data analysis (SEDA). Specifically, we estimate the Moran I test of global spatial correlation (see Moran, 1950) and an indicator of the local spatial association (LISA) (see Anselin, 1995). Both approaches test the hypothesis of spatial randomness against the alternative of spatial clustering across each Italian province and BES elementary indicators.

We perform these spatial assessments for each year from 2012 to 2019. The assessment highlights significant spatial auto-correlation for many indicators in all well-being domains. Moran’s I coefficients notably differ from zero, indicating spatial solid clustering or patterns. Some indicators show variations in spatial auto-correlation across time. For instance, Life expectancy at birth and Women’s political representation in municipalities increased their spatial clustering from 2012 to 2019. Some indicators demonstrate non-significant spatial auto-correlation, reflected in higher p-values (above 0.05), indicating a lack of spatial clustering. For instance, Public transport networks, Specialized doctors, and Density of historical green areas, among others, show no significant spatial patterns in both years. The LISA assessment highlights the highest concentration of spatially correlated observations in East-North and Southern areas. The economic domain has the greatest number of elementary indicators with significant spatial correlation and clustering. Surprisingly, the environmental indicators only have a weak spatial association. The detailed results from the exploratory spatial assessment outlined above are in Sect. 1 of the Appendix.

This empirical evidence favours our hypothesis that neighbouring provinces share information on socioeconomic development levels. Thus, we estimate latent factor analytic statistical models that flexibly account for spatial correlation in the observed data.

3 Methodology: Bayesian Factor Model for Spatially Correlated Data

We incorporate spatial information following the Bayesian factor model proposed by Hogan and Tchernis (2004). This model is based on a latent variable framework, where elementary indicators manifest a hidden construct—the province’s well-being.

For province i, where \(i=1,\ldots ,N\), with \(N = 107\) Italian provinces, let \(Y_{id}\) denote the elementary indicator d in province i. The length D of the observed vector \(\varvec{Y}_i = (Y_{i1},\ldots ,Y_{iD})\) depends on the well-being domain considered: the social domain has \(D = 20\) indicators, the economic domain has \(D= 9\) indicators, and the environmental domain has \(D =5\) indicators.

For each observation i, the latent factor model assumes an L dimensional \(\left( L<D\right)\) latent variable \(\delta _{i}\) that fully characterizes socioeconomic characteristics, which in turn manifest themselves through \(\varvec{Y}_{i}\). Here, we assume \(L=1\), hence reducing the model to one latent factor for each province, and represent the model in a hierarchical form as in Fig. 1.

Fig. 1
figure 1

A graphical representation of a Bayesian hierarchical latent variable model

On the level of observed data, the likelihood is:

$$\begin{aligned} \varvec{Y}_{i}\mid \varvec{\mu },\varvec{\lambda },\delta _{i},\Sigma \sim \text {Multivariate Normal}\left( \varvec{\mu }+\varvec{\lambda }\delta _{i},{\Sigma }\right) , \end{aligned}$$
(1)

where \(\varvec{\mu }\) is a \(D \times 1\) mean vector, \(\varvec{\lambda }\) is a \(D\times 1\) vector of factor loadings, and \({\Sigma }=\textrm{diag}\left( \sigma _1^2,\ldots ,\sigma _{D}^2\right)\) is a diagonal matrix measuring residual variation in \(\varvec{Y}_{i}\), implying independence among the elements of \(\varvec{Y}_{i}\) conditionally on \(\delta _{i}\).

In this model, each factor loading is a variance component, i.e. \(\lambda _d = \text {cov}(Y_{id},\delta _i)\). Because residual variances \(\sigma _d\) differ across elementary indicators, factor loadings measure covariance on different scales. They cannot be directly compared to assess the strength of the association between the elementary indicator and provinces’ well-being. Instead, following Hogan and Tchernis (2004) and Davis et al. (2021), we can examine squared correlation coefficients \(\lambda _d^2 /(\lambda _d^2 + \sigma _d^2)\). The squared correlations represent the proportion of variation in latent province well-being explained by each elementary indicator, providing a comparable measure to the weights used in a standard weighted average.

On the second level, let \(\varvec{\delta }=\left( \delta _{1},\ldots ,\delta _{N}\right) ^T\) be the vector of province latent indexes. The prior distribution is:

$$\begin{aligned} \varvec{\delta }\sim \text {Multivariate Normal}\left( \varvec{0}_N,\Psi \right) , \end{aligned}$$
(2)

where \(\Psi\) is a \(N\times N\) spatial covariance matrix with 1’s on the diagonal and \(\psi _{is}=\text {corr}\left( \delta _i\delta _s\right)\) on the off-diagonal. When \(\Psi =I_N\), the model assumes spatial independence. The well-being composite index for province i is summarized by the mean of the posterior distribution of the latent factor \(\delta _i\) given \(\varvec{Y}\) and \(\varvec{\mu },\varvec{\lambda },\Sigma\).

The prior distributions for the remaining parameters in (1) are:

$$\begin{aligned}&\lambda _d\sim \text {Normal}(g,G)(\lambda _1>0); \end{aligned}$$
(3)
$$\begin{aligned}&\sigma ^2_d \sim \text {Inverse-Gamma}(\alpha /2,\beta /2); \end{aligned}$$
(4)
$$\begin{aligned}&\mu _d \sim \text {Normal}(0, V_\mu ). \end{aligned}$$
(5)

The primary scope of prior distributions is to include subjective opinions on the parameters of interest. However, to let the data speak for themselves and simplify the derivation of posterior distributions, we use conjugate diffuse priors by choosing \(g=0\), \(G=10000\), \(\alpha =1/1000\), \(\beta =1/1000\), and \(V_\mu =1000\).

To include spatial correlation, we work on the spatial covariance matrix \({\Psi }\), parametrizing it both marginally and conditionally. The first marginal specification assumes that the generic element \(\psi _{is}\) of the prior covariance matrix is

$$\begin{aligned} \psi _{is}=\text {corr}(\delta _i\delta _s)= \text {exp}(-\omega d_{is}), \end{aligned}$$
(6)

where \(\omega\) models spatial correlation and \(\omega \ge 0\) ensures that \(\psi _{is}<1\); \(d_{is}\) is the Euclidean distance between centroids of area i and s and \(d_{ii}=0\) by definition (see Hogan and Tchernis, 2004).

The second way to parametrize the covariance matrix \({\Psi }\) is through conditional auto-regressive (CAR) specifications of spatial dependency (see Besag et al., 1991). The more general structures are the Gaussian CAR models. These models first require the construction of a set \(\mathcal {R}_i\) of areas neighbors of area i. Thus, if we assume the conditional distribution of each \(\delta _i\) to be

$$\begin{aligned}\delta _i\mid \{\delta _s: s \in \mathcal {R}_i\}\sim \text {Normal}\left( \sum _{s \in \mathcal {R}_i}{\beta }_{is}{\delta }_s,\frac{1}{\alpha _i}\right) , \end{aligned}$$

then the joint marginal distribution of \(\varvec{\delta }=({\delta }_1,\ldots ,{\delta }_N)^T\) follows a \(\mathrm {Multivariate-Normal}(\varvec{0},{B}^{-1})\), where B is \(N \times N\) spatial covariance matrix with \(\{\alpha _1,\ldots ,\alpha _N\}\) along the diagonal and \(-\alpha _i\beta _{is}\) on the off-diagonal, provided that B is symmetric and positive definite (see Besag, 1974). The \(\beta _{is}\) are general weights defining the influence of province s on the prior mean of \(\delta _i\), while \(\alpha _i\) represents area-level characteristics such as the number of neighborhoods Hogan and Tchernis (2004).

To ensure that B is positive-definite and symmetric, one or more parameters in the CAR models should be constrained. Here, we consider two different CAR specifications.

Model CAR A defines \(\mathcal {R}_i\) as the set of adjacent indicator tracts. R is an adjacency (weight) matrix with \(R_{ii}=0\), \(R_{is}=I(s\in \mathcal {R}_i)\) and \(R_{is}=R_{si}\). Thus, the model assumes \(\beta _{is}=\omega R_{is}\) and \(\alpha _i=1\) (constant), where \(\omega\) measures the degree of spatial correlation. This leads to the definition

$$\begin{aligned} {B}={I}_N-\omega {R}. \end{aligned}$$
(7)

One necessary condition for ensuring that B is positive definite and symmetric is that the ordered eigenvalues \(\xi _1,\ldots ,\xi _N\) of R satisfy: \(\xi _1^{-1}< \omega < \xi _N^{-1}\).

Model CAR B, defines \(\mathcal {R}_i\) in the same way as CAR A but here \(\beta _{is}=\omega R_{ij}(n_s/n_i)^{1/2}\), and \(\alpha _i=n_i\). Where \(n_i\) and \(n_s\) are the number of neighbours of area i and s, respectively.

For this model

$$\begin{aligned} {B}=\text {diag}(n_i)-\omega (n_i*n_s)^{(1/2)}{R}. \end{aligned}$$
(8)

We estimate the model posterior distribution using Markov Chain Monte Carlo methods. Specifically, we use a Gibbs sampling algorithm that includes Metropolis Hasting steps to estimate spatial parameter \(\omega\). At each step of the sampling algorithm, we obtain a draw from the conditional posterior distribution of the model parameters and the latent well-being \(\varvec{\delta }\). We use these draws to build the posterior distributions of all model parameters after accounting for a burn-in period before convergence. We simulate 6000 draws and“burn”3000 of them. To obtain our distribution of well-being ranking, we rank the estimates of \(\delta _i\) in each sampling iteration. The province posterior mean ranking is the mean of a province’s rank across all iterations.

A key advantage of this model is that it can handle missing values through a posterior imputation procedure. The procedure replaces missing elementary indicator values with“draws”from the first level equation conditional on current iterations’ “draws”of the latent factor and the other models”parameters (for more details, see Davis et al. 2021).

We carried out a sensitivity analysis to assess the impact of our prior choices on: (a) the parameters \(\mu _\omega\) and \(V_\omega\) of the spatial parameter \(\omega\) prior distribution, (b) the prior mean and variance, g and G, of the factor loading \(\lambda _j\), and (c) the prior variance \(V_\mu\) of the mean \(\mu\). We also changed the seed or initial values. Finally, we modified the definition of the spatial topology in the CAR models by increasing the number of neighborhoods and defining the spatial weight matrix R differently. In each case, the resulting estimates remained stable.

The results from this assessment prove the stability of the estimated values to variation in prior choices with a slight degree of instability in the marginal correlation model when changing the prior distribution on the spatial parameter. Data are available upon request.

4 Results from the Empirical Application

In this section, we compare and comment on the results from estimating the Bayesian factor model described above on the Province BES data. We first focus on the three well-being domains separately (Sect. 4.1). Then we assemble the three well-being dimensions into an overall well-being indicator (Sect. 4.2). Finally, we aggregate provincial composite indicators into macro areas (NUTS 1) and estimate the Italian macro areas’ well-being over the years (Sect. 4.3).

4.1 Economic, Social and Environmental Well-Being

We first summarize the posterior distributions of the factor loadings (\(\lambda _d\)), residual standard deviations (\(\sigma _d\)), and squared correlations (\(\lambda _d^2/(\lambda _d^2 +\sigma _d^2))\) for each well-being domain. Next, we present composite indicator estimates for each province, which enable us to examine the extent of divergence in provincial well-being over time and space. To aid in visualizing this heterogeneity, we employ maps that offer an intuitive representation of the spatial distribution of well-being. Furthermore, we compare our data-driven posterior well-being rankings with those obtained through the widely used Mazziotta–Pareto methodology, as extensively documented in the literature (De Muro et al., 2011; Mazziotta & Pareto, 2013). We document the level of agreement between both CIs’ approaches. In the following, we focus solely on the results derived from model CAR B, which demonstrated superior models’ performances compared to the other spatial models (see Sect. 1 for models’ selection results).

Tables 1, 2, and 3 report the mean posterior estimates of factor loadings, residual standard deviations and squared correlations in the year 2019 (comparisons across spatial models and for 2012 are in Appendix 1).

Starting from Table 1, we find that the leading indicators in the economic domain are Employment rate, Non-participation rate, and Youth non-participation rate, followed by Pensioners with low pension. These indicators exhibit high squared correlations, indicating their significant and equal weights in explaining the variation in latent economic well-being. Moving on to the environmental domain (Table 2), we observe that the indicators Waste recycling services and Separate collection of municipal waste are the primary drivers of variation in environmental well-being. However, we note that the remaining indicators in this domain have much smaller squared correlations, suggesting a lower impact on environmental well-being.

Finally, turning our attention to Table 3, we find that the most influential indicator for social well-being is Graduates mobility, followed by People not in education employment or training (neet) and People with at least upper secondary education (25–64 years). The square correlations for this domain differ significantly across elementary indicators. Suggesting that the elementary social indicators only partially explain social well-being variation across provinces.

Our data-driven approach reveals that elementary indicators have varying weights and contributions to each well-being domain. This result challenges traditional approaches that equally weigh all indicators and emphasizes the need to consider each domain’s specific context and characteristics when evaluating provincial well-being.

Table 1 Economic well-being: factor loadings, residual standard deviations, and squared correlations with 95% credibility intervals, on CAR B model, in 2019
Table 2 Environmental well-being: factor loadings, residual standard deviations, and squared correlations with 95% credibility intervals, on CAR B model, in 2019
Table 3 Social well-being: factor loadings, residual standard deviations, and squared correlations with 95% credibility intervals, on CAR B model, in 2019

Figures 2, 3, and 4 illustrate each province’s well-being composite indicator and its posterior credibility interval in 2012 and 2019. These figures allow us to assess the variation in well-being trends and rankings of the Italian provinces relative to the Italian mean (represented by the vertical dotted line at 0). Additionally, in Appendix 1, we report for each province the posterior distribution quantiles for the three well-being composite indicators.

Examining the figures, we can discern notable patterns in the distribution of well-being across different domains over time. Specifically, in Figs. 2 and 3, we observe consistent stability in the well-being distribution for the social and economic domains. Only a few provinces exhibit above-average values in the social domain, while most provinces cluster around the mean. This behaviour suggests low polarization in social well-being across Italian provinces. In turn, the economic domain displays more provinces with above and below-average values, indicating stronger polarization of economic levels across the Italian surface.

Figure 4 focuses on the environmental composite indicator. Here, we observe more pronounced variations across the years. Specifically, from 2012 to 2019, a significant decline in environmental well-being is evident across most Italian provinces, leading to heightened polarization in this domain.

Finally, the posterior credibility intervals offer additional insights into the uncertainty surrounding our findings. Over the years, social and economic well-being consistently display relatively narrow credibility intervals for all provinces, indicating low uncertainty in the CIs estimates. On the other hand, there is a gradual decrease in the width of confidence intervals in the environmental dimension over time. This shift reflects an increasing confidence in the CI’s point estimate for 2019 compared to 2012. Notably, the figures highlight the impact of missing data in the elementary indicators of Sud Sardegna province. These missing values notably increase the uncertainty surrounding Sud Sardegna’s CI estimate, evident through much wider credibility intervals than in other provinces.”

Fig. 2
figure 2

Social well-being: composite indicator estimates for Italian provinces in 2012 (left panel) and 2019 (right panel). Note: In each panel, the bars indicate each province’s mean posterior composite indicator value. The horizontal black line corresponds to the 90% posterior credibility interval. The vertical bar at 0 indicates the Italian average for 2012–2019. The wide credibility interval for Sud-Sardegna province is due to the high percentage of missing values. Source: Our elaboration of ISTAT “Province BES” data

Fig. 3
figure 3

Economic well-being: composite indicator estimates for Italian provinces in 2012 (left panel) and 2019 (right panel). Note: see Fig. 2

Fig. 4
figure 4

Environmental well-being: composite indicator estimates for Italian provinces in 2012 (left panel) and 2019 (right panel). Note: see Fig. 2

Next, we map the composite indicators’ estimates for all Italian provinces at the beginning (the year 2012) and end (the year 2019) of the analysis period. Figures 5 and 6 showcase the spatial distributions of social, economic and environmental well-being, respectively. Consistent with our earlier findings, the spatial distribution of social well-being (Fig. 5) remains relatively stable over time. Some of the Northern provinces, particularly the regional capitals, exhibit higher levels of social well-being. In the economic domain, a noticeable polarisation persists between the northern and southern provinces, with the former consistently displaying higher levels of well-being. Regarding environmental well-being (Fig. 6), the North-East provinces tend to fare better. However, an interesting trend emerges: the northern provinces show improvement in their environmental well-being levels, whereas the southern provinces experience a marked decline, indicated by the darker shading throughout the years.

Finally, for each well-being domain, we compare the rankings based on our method with rankings based on the Mazziotta–Pareto methodology, widely used for policy decision-making in Italy. The Mazziotta–Pareto index (MPI) consists of the arithmetic mean of normalized elementary indicators, incorporating a penalization term for indicator variability, with equal weights assigned to all indicators. The general formula for computing the index entails two steps (Mazziotta & Pareto, 2013). First, we calculate the normalized indicator values as follows:

$$\begin{aligned} z_{id} = 100 + \frac{(y_{id} - \bar{y}_d)}{s_d}10, \end{aligned}$$

where \(\bar{y}_d\) and \(s_d\) represent the elementary indicator d mean and standard deviation respectively. Then, we estimate the MPI as follows:

$$\begin{aligned} \text {MPI}_i = M_{z_{i}} + S_{z_{i}}\text {cv}_i, \end{aligned}$$

where \(M_{z_{i}} = \frac{\sum _{d = 1}^D z_{id}}{D}\), \(S_{z_{i}} = \sqrt{\frac{\sum _{d = 1}^D (z_{id}-M_{z_{i}})^2}{D}}\), and \(\text {cv}_i =\frac{S_{z_{i}}}{M_{z_{i}}}\).

Figures 7 and 8 show the rankings estimated by our Bayesian model on the x-axes and the corresponding Mazziotta–Pareto rankings on the y-axes. The diagonal line indicates perfect agreement between the Mazziotta–Pareto rank and our mean posterior rank. The farther the provinces are located from this line, the higher the disagreement between the two methodologies.

First, we notice high agreement (Pearson correlation coefficients \((\rho ) >0.8\)) between the two methodologies, more pronounced at the top 20% of the rank distribution in all three domains. The economic domain has the highest ranking agreement (\(\rho = 0.96\)), followed by the social (\(\rho = 0.92\)) and the environmental domains (\(\rho = 0.86\)). We observe more disagreement towards the bottom to the middle of the distribution. These discrepancies in rankings can be attributed to the variation in the weights assigned to the elementary indicators in our model compared to the equally weighted Mazziotta–Pareto indicator. This finding indicates that different evaluation methods for provincial well-being can lead to changes in provincial ranks. However, it is important to note that our results align with the Mazziotta–Pareto ranking for some provinces, suggesting that certain provinces have specific needs that warrant more focused interventions.

Fig. 5
figure 5

Maps of provincial social well-being composite indicators, for 2012 (top panel) and 2019 (bottom panel). Note: Italian provinces are grouped in well-being quintiles. The more ’purple’ colors refer to worse-off provinces, while ’greener’ shades indicate better-off provinces. The black dots indicate provincial capitals. Provinces with negative values are below the Italian averages over the entire period of analysis

Fig. 6
figure 6

Maps of provincial economic (left) and environmental (right) well-being composite indicators, for 2012 (top panel) and 2019 (bottom panel). Note: see Fig. 5

Fig. 7
figure 7

Social well-being posterior mean rankings and Mazziotta–Pareto rankings for 2019. Note: Posterior mean rankings produced by model CAR B. The R in the left corner is the Pearson correlation coefficient between posterior mean ranking and the Mazziotta–Pareto rankings. We remove Sud-Sardegna province from the plot for its many missing values

Fig. 8
figure 8

Economic well-being (a) and environmental well-being (b) posterior mean rankings and Mazziotta–Pareto rankings for 2019. Note: see Fig. 7

4.2 Overall Well-Being

To provide a comprehensive assessment of the overall well-being of each province, we condense the three previously estimated composite indicators into a single composite value. The three composite indicators already account for the spatial correlation among Italian provinces. Here, we employ a spatially independent latent factor model using posterior mean estimates of the three well-being composite indicators as the model’s outcomes. Let \(\varvec{\hat{\delta }}_{i}=(\hat{\delta }_{i1},\hat{\delta }_{i2},\hat{\delta }_{i3})\) indicating the 3-dimensional vector of composite well-being indicators for each province i. We consider the following Bayesian factor model:

$$\begin{aligned} \varvec{\hat{\delta }}_{i} \mid \varvec{\mu },\varvec{\lambda } ,w_i, \Sigma \sim \text {Multivariate Normal}(\varvec{\mu } + \varvec{\lambda } w_i, \Sigma ) \quad \textit{likelihood} \\ w_i \sim \text {Normal}(0,1) \quad \textit{prior} \end{aligned}$$

We estimate the posterior distribution of factor loadings, residual standard deviation, and squared correlations, as presented in Table 4. This table reveals two key insights. Firstly, we observe a strong correlation between the overall well-being composite indicator (\(w_i\)) and the economic and social well-being composite indicators. Notably, the economic domain appears to have the highest weight among the well-being domains. This suggests that targeting economic aspects in low-developed provinces would reduce the disparities in overall well-being between Italian provinces.

Figure 9 provides a map representation to illustrate overall well-being visually on the Italian surface. The spatial distribution of overall well-being is non-random, with the northern provinces consistently exhibiting higher levels of well-being while the southern provinces consistently experience lower well-being. Additionally, overall well-being slightly increases over time in some southern and central provinces, revealing a moderate polarisation reduction.

Table 4 Overall well-being: factor loadings, residual standard deviations and squared correlations with 95% credibility intervals in 2019
Fig. 9
figure 9

Maps of provincial overall well-being composite indicator, for 2012 (top panel) and 2019 (bottom panel). Note: see Fig. 5

4.3 Macro Region Well-Being

Finally, we aggregate provinces belonging to the same macro-region m (NUTS1), for \(m = 1, \ldots , 5\), i.e. Northwest, Northeast, Center, South, and Islands, to assess the evolution of the Italian macro-regional well-being over time. We consider a hierarchical model, which requires specifying a prior distribution for the mean (\(\alpha _{m[i]}\)) for each macro-area m of the latent variable \(\hat{\delta }_{i}\). We also assume the variance (\(\sigma _{m[i]}\)) to vary across macro-areas. More formally, for the three well-being domains, the model becomesFootnote 2:

$$\begin{aligned} \hat{\delta }_{i}\mid \alpha _{m[i]}, \sigma _{m[i]} \sim \text {Normal}(\alpha _{m[i]},\sigma _{m[i ]}) \quad {likelihood} \\ \alpha _{m[i]}\sim \text {Normal}(0,1) \quad {prior}\\ \sigma _{m[i]}\mid \nu , \tau \sim \text {cauchy}(\nu ,\tau ) \quad {prior} \\ \end{aligned}$$

As standard practice, we chose a normal distribution as the prior distribution for the mean (\(\alpha _{m[i]}\)) and a Cauchy distribution for the standard deviation (\(\sigma _{m[i]}\)) of the latent factor distribution (Gelman et al., 2013), and interpret \(\alpha _m\) as the well-being level of macro-region m.

Figures 10 and 11 show each macro area time series for social, economic, environmental, and overall well-being. These figures reveal a consistent and enduring macro-territorial division that characterizes the Italian territory throughout the analyzed period. Notably, the South and Islands consistently fall below the average, while the Center, Northwest, and Northeast remain above the average. Moreover, these macro areas intersect in specific years and for particular well-being domains.

The trend in economic well-being remains relatively flat over time, with a consistent ranking of macro areas across the years. On the other hand, social well-being, illustrated on the left in Fig. 10 exhibits more interaction among macro areas over the years. The Center aligns with the Northwest, maintaining a similar trajectory until 2019, while the Northeast shows a slight upward trend. In contrast, the Islands’ social well-being experienced a decline over time, reaching a lower level in 2019 compared to the beginning of the series.

Only environmental well-being demonstrates a non-flat trend over time among the four estimated time series. In 2016, the Northwest and Northeast aligned, while the South experienced a steady decline after 2015. The Center exhibits an upward trend after 2016, and the Islands remain relatively stable.

Finally, the overall domain mirrors the evolution of social and economic well-being levels. The environmental domain contributes minimally to determining the overall well-being trend.

Fig. 10
figure 10

Social (left) and economic (right) well-being composite indicator for Italian macro territorial areas (black dotted line indicates the Italian average)

Fig. 11
figure 11

Environmental (left) and overall(right) well-being composite indicator for Italian macro territorial areas (black dotted line indicates the Italian average)

5 Concluding Remarks

This paper applies a Bayesian spatial latent factor model to propose well-being composite indicators and rankings for all Italian provinces from 2012 to 2019. Our approach differs from traditional composite indicators methodologies in several ways. First, we modeled the spatial dependence of elementary indicators, capturing potential socioeconomic spillover effects. Second, we incorporate a measure of composite indicators uncertainty related to missing data. Third, we estimate data-driven weights for elementary indicators, thus avoiding an arbitrary selection of weight exposed to subjective opinion.

Using the“Province BES”dataset from ISTAT, we examine the assumption of spatial independence in the elementary indicators by conducting global and local tests of spatial association. This initial assessment confirms positive spatial association in the“Province BES”indicators. We then categorize the indicators into three sustainable development well-being domains: social, economic, and environmental. Employing a Bayesian approach, we estimate the posterior distribution of latent variables, with their expected values interpreted as hidden well-being indicators for Italian provinces.

The study reveals significant disparities in social and economic well-being between northern and southern regions, with the northern provinces consistently demonstrating higher levels of well-being. In contrast, the environmental dimension exhibits less persistent polarization, with above-average levels observed in the South. One possible interpretation is that environmental consciousness has gained prominence more recently than socioeconomic aspects. Consequently, northern and southern provinces are experiencing increased climate awareness, with similar provincial investments in environmental well-being rates. Compared to the Mazziotta–Pareto approach, our rankings diverge, particularly at the upper end of the well-being distribution and within the environmental domain. Uncertainty in ranking estimates is also higher for provinces that are better off. These findings suggest that the government could allocate resources more effectively by targeting provinces at the lower end of the well-being ranking.

Subsequently, we reduce the three well-being dimensions into an overall well-being indicator for each Italian province. This composite indicator, driven primarily by the economic domain and with minimal weight given to environmental well-being, remains stable and clustered throughout the analyzed period. These findings emphasize the significance of economic factors in shaping overall well-being and highlight regional disparities within Italy. They also indicate that focused interventions to improve economic conditions can reduce well-being disparities between provinces.

Additionally, we extend the analysis to the NUTS-1 level, encompassing Northwest, Northeast, Center, South, and Island macro-regions, in order to provide well-being trends across the analyzed period. The results demonstrate varying degrees of heterogeneity among these macro areas.

The primary limitation of this study is the limited number of indicators available within the environmental dimension compared to the social and economic dimensions, which also suffer from more missing observations. As long as data on environmental aspects remain scarce, it will be challenging for researchers to provide robust evidence in favour of climate policy interventions. In future research, we aim to enrich the environmental analysis by integrating advanced sensor measurements of air pollution, water quality, and soil temperature into national accounts. Additionally, we plan to incorporate a subjective dimension that considers citizens’ perceptions of life satisfaction.

Overall, this study contributes to the understanding of well-being dynamics in Italy, offering valuable insights for policymakers in addressing regional disparities and focusing on targeted interventions for improved well-being outcomes.