Three-fold Fay–Herriot model for small area estimation and its diagnostics

This paper introduces a three-fold Fay–Herriot model with random effects at three hierarchical levels. Small area best linear unbiased predictors of linear indicators are derived from the new model and the corresponding mean squared errors are approximated and estimated analytically and by parametric bootstrap. The problem of influence analysis and model diagnostics is addressed by introducing measures adapted to small area estimation. Simulation experiments empirically investigate the behavior of the predictors and mean squared error estimators. The new statistical methodology is applied to Spanish living conditions survey of 2004–2008. The target is the estimation of proportions of women and men under the poverty line by province and year.

1 Introduction Fay and Herriot (1979) introduced an area-level linear mixed model (LMM) that improves the accuracy of direct estimators by incorporating auxiliary information.In small area estimation (SAE), the advantage of using procedures based on arealevel models lies in the greater facility to incorporate explanatory variables of quality in the inferential process, with respect to other approaches based on individual models.See, e.g.Chapter 16 of Morales et al. (2021).
Direct estimators only use the data for the variable of interest in the estimation domain.In those domains with small sample sizes, the direct estimators will have large variances and therefore will not be very precise.An alternative estimation procedure consists of introducing auxiliary information by means of a statistical model.Area-level linear mixed models do this and can take into account hierarchical, spatial or temporal correlation of the data.
The surveys or administrative records from previous time periods and the hierarchical structure of the data provide important sources of information that can be used to improve the efficiency of small area estimators.Some papers have generalized the Fay-Herriot model to borrow strength from time.They have considered correlated structures for sampling errors or random effects, time-varying random slopes, and state-space models.Without claiming to be exhaustive, we can cite the contributions of Pfeffermann and Burck (1990), Rao and Yu (1994), Ghosh et al. (1996), Datta et al. (1999Datta et al. ( , 2002)), You and Rao (2000), Singh et al. (2005), González-Manteiga et al. (2010).Esteban et al. (2012), Marhuenda et al. (2013), Morales et al. (2015), Benavent and Morales (2021).
Another possible generalization of the Fay-Herriot model is to take advantage of the territorial and demographic structure of the population.Rao and Yu (1994) introduced an area-level LMM with two levels of hierarchy.However, its basic model has not been extended to three levels in order to jointly model data from domains, subdomains, and time periods or subsubdomains.Torabi and Rao (2014) studied an extension of the Fay-Herriot area-level model to sub-area level, to obtain estimators for sub-areas nested within areas.Cai and Rao (2022) extended the transformation method of Cai et al. (2020) for variable selection to a three-fold linking model.They proposed two transformation-based methods for variable selection: one is parameter free and the other is parameter-dependent.However, they didn't report EBLUP estimation for the three-fold proposed model.While Krenzke et al. (2020) developed an area-level univariate and bivariate Hierarchical Bayes linear three-fold model for proportions and averages.
The scarce research on hierarchical temporal models may be due to the fact that the usual periodicity of surveys in public statistics (quarterly or annual) does not allow cross-section data obtained with the same methodology over long periods of time.For these reasons, it is difficult to fit models with complex correlation structures to data, when the number of consecutive surveys available is not large enough.This paper covers in part this gap by introducing, for the first time in the statistical literature, a three-fold Fay-Herriot model, but having in mind a real data problem of poverty estimation and mapping.

3
Three-fold Fay-Herriot model for small area estimation and… The new model is an area-level LMM with known error variances.This fact prevents, for example, the application of functions of the R packages lme4 or nlme for fitting LMMs.This is why we derive and implement a Fisher-scoring algorithm to estimate the model parameters by the residual maximum likelihood (REML) method.Based on the model, the empirical best linear unbiased predictors (EBLUP) of linear indicators are derived and the corresponding mean squared errors (MSE) are approximated.Two MSE estimators are considered.The first one is based on the analytical MSE approximation and the second one is based on parametric bootstrap.In the first case, we follow (Prasad and Rao 1990) and (Datta and Lahiri 2000).In the second case, we follow (Hall and Maiti 2006) and (González-Manteiga et al. 2008).
Influence and diagnostics analysis are tools to validate a model and to get valuable information to interpret underlying data relationships.The aim is to investigate if there are unfavorable cases by some established tolerance limits.Generally, exceeding those thresholds of attention may give an idea about the potential lack of fit of the model adopted.Some statistics in the literature are traditionally employed for detecting several aspects of these warnings, and then for assessing their theoretical and empirical implications.We follow (Demidenko and Stukel 2005;Zewotir and Galpin 2007;Christensen et al. 1992;Calvin and Sedransk 1991) to introduce leverages, Cook's distances, Covratio statistics, and residuals.
Nevertheless, when LMMs are applied to SAE problems, the influence and diagnostic tools should also take into account the impact of deleting a domain in the EBLUPs and in the estimators of the corresponding MSEs.As far as we know, this SAE-specific influence analysis has not been yet investigated.This paper introduces new tools to analyze the influence of domain deletion in the SAE results.Further, it also gives measures of model efficiency by taking into account the correlations between residuals and perturbations and the variability of the variance component estimators.The new diagnostic tools are applied to the three-fold Fay-Herriot model selected to analyze poverty data.
The eradication of poverty is a priority in 21st-century societies.This has led to the emergence of many studies to investigate statistical methodology that allows poverty to be mapped at different levels of territorial aggregation.Regarding estimation techniques in small areas, we can cite some works.Molina and Rao (2010) proposed empirical best predictors (EBPs) based on a nested error regression model.Hobza and Morales (2016), Hobza et al. (2018) fitted a logit mixed model to unit-level poverty data and derived EBPs for poverty proportions.Tzavidis et al. (2008), Marchetti et al. (2012) introduced robust estimators of poverty indicators by using a quantile regression approach.Marchetti and Secondi (2017), Giusti et al. (2017), Tzavidis et al. (2015) gave interesting contributions to mapping poverty and health using Italian data while Tonutti et al. (2022) assessed anti-poverty policy at the Italian local level, and Guadarrama et al. (2022) based their poverty predictions on temporal linear mixed models.In the case of area-level models, Esteban et al. (2012), Marhuenda et al. (2013), Morales et al. (2015).However, Boubeta et al. (2016Boubeta et al. ( , 2017)), López-Vizcaíno et al. (2013, 2015), Chandra et al. (2017) employed generalized linear mixed models, with Poisson or multinomial distributions.The books edited by Pratesi (2016), Betti and Lemmi (2013) give more contributions to the small area estimation of poverty indicators.
This paper introduces new statistical methodology for poverty mapping and presents an application to data from the Spanish living condition survey (SLCS).This survey is designed to obtain reliable direct estimators in NUTS 2 type regions (autonomous communities), but the sample sizes are quite small in NUTS 3 type territories (provinces), according to the current NUTS classification of EURO-STAT 2016.The aim of the study is to estimate the proportions of poverty by sex in the Spanish provinces during the years 2004-2008.For this reason, this paper introduces an area-level LMM with random effects in the province, in the province crossed with sex and in the crosses of province, sex and year.A three-fold Fay-Herriot model with time correlated AR(1) random effect is fitted for the application to SLCS data, but the results suggest to prefer a simpler model without complex temporal correlation structure, since we only have data for 5 periods (years).
The paper is organized as follows.Section 2 introduces the new three-fold Fay-Herriot model, the fitting algorithm to calculate the REML estimators of the model parameters, the specific influence and diagnostic tools, the EBLUPs of domain linear indicators and the analytic and bootstrap estimators of the corresponding MSEs.Section 3 presents some simulation experiments to investigate the performance of the fitting algorithm, the predictors, and the estimators of MSEs.In addition, the BLUPs and EBLUPs based on of the one-fold, two-fold and three-fold Fay-Herriot models are investigated.Section 4 deals with the application to real data and the mapping of poverty proportions.Section 5 gives some conclusions.The paper contains three appendixes.Appendix A derives the final expression of the analytic estimator of the MSEs.Appendix B defines the three-fold Fay-Herriot model, in case of AR(1) time effects, and presents the REML procedure for the estimation of the variance component parameters.Appendix C gives the list of Spanish provinces with the corresponding acronyms.

The model
Let us consider the three-fold Fay-Herriot model where y drt is a direct estimator of the characteristic of interest and x drt is a row vec- tor containing the aggregated population values of p auxiliary variables.The subscripts d, r and t represent domains, subdomains, and subsubdomains respectively.For example, d, r and t may represent area (province), category (sex group), and time period (year) respectively.We assume that u (2.1) Three-fold Fay-Herriot model for small area estimation and… generalizes the two-fold Fay-Herriot model described in Section 17.2 of Morales et al. (2021).At the domain level d and subdomain level r, we define the matrices and vectors where 1 T denotes the T × 1 vector of ones, I T is the T × T identity matrix and col and diag are the columns and diagonal matrix operators respectively.
We can write model (2.1) in the subdomain form The variance matrix of y dr is At the domain level d, we define the vectors and matrices We can write model (2.1) in the domain form

Let us define the
. We can write model (2.2) in the linear mixed model form The variance matrix of y d is At the population level, we define the vectors and matrices , 1≤t≤T (e drt ) ∼ N T (0, V e,dr ), 1≤r≤R (e dr ) ∼ N RT (0, V e,d ). (2.2) 1 3 We can write model (2.1) in the population form where the vectors u 1 , u 2 , u 3 and e are mutually independent.The variance matrix of y is Let us define the DRT × (D + DR + DRT) matrix Z = (Z 1 , Z 2 , Z 3 ) and the . We can write model (2.3) in the linear mixed model form If we assume that  2 1 > 0 ,  2 2 > 0 , and  2 3 > 0 are known, then the best linear unbi- ased estimator (BLUE) of and the best linear unbiased predictor (BLUP) of u are The empirical versions, EBLUE of and EBLUP of u , are obtained by plugging estimators σ2 1 , σ2 2 and σ2 3 in the place of 2 1 , 2 2 and 2 3 , i.e. , (2.3) (2.4) Three-fold Fay-Herriot model for small area estimation and… where We are interested in predicting the linear combination of fixed and random effects and the BLUP of can be written in the form

Estimation of variance component parameters
This Section gives a Fisher-scoring algorithm to calculate the REML estimators of 2 1 , 2 2 and 2 3 .The REML log-likelihood is By taking derivatives of l reml with respect to a , we have the elements of the score vector S = S( ) = (S 1 , S 2 , S 3 ) � , i.e.
By taking second derivatives with respect to a and b , taking expectations and changing the sign, we obtain the components of the Fisher information matrix F = F( ) = F ab a,b=1,2,3 , where The updating formula of the Fisher-scoring algorithm is As algorithm seeds, we may take (0) The REML estimator of is where V = V( ̂ ) was defined above.Under regularity assumptions, the asymptotic distributions of ̂ and ̂ are Asymptotic confidence intervals, at the level 1 − , for a and j are where F −1 ( ̂ ) = ( ab ) a,b=1,2,3 , (X � V −1 ( ̂ )X) −1 = (q ij ) i,j=1,…,p and z is the -quantile of the standard normal distribution.If βj =  0 , then the p-value for testing H 0 ∶ j = 0 is

Influence analysis and model diagnostics
Influence and diagnostics analysis are tools to validate a model, but also to get valuable information to interpret underlying data relationships.This section assumes that model parameters are estimated by the REML method.The aim is to investigate if there are unfavorable cases by some established tolerance limits.
1 3 Three-fold Fay-Herriot model for small area estimation and… Generally, exceeding those thresholds of attention may give an idea about the potential lack of fit of the model adopted.Some statistics in the literature are traditionally employed for detecting several aspects of these warnings, and then for assessing their theoretical and empirical implications.We focused on some diagnostic measures, like the leverage points, the impact of domain deletion on fixed effects, the Cook's distances, and the residual analysis.We present some additional valuations, given by some peculiarity of the area-level LMMs for SAE.
In these types of models, like in the Fay-Herriot model or in the model (2.3), the variance of the regression error is given, instead of the case of the standard LMM.This fact has some interesting consequences, in terms of the measure of the general efficiency of the model.This section presents some influence and diagnostics tools adapted to the model (2.3).In particular, Sect.2.3.4 introduces new influence statistics derived from the estimators of the MSEs of the EBLUPs, as the model efficiency and the mse-ratio.

Leverages
By following Demidenko and Stukel (2005) (Demidenko and Stukel 2005).In particular, if the row-vectors x � drt ∼ N( x , x ) of the matrix X d follow a multinormal distribution, so that x ( Belsley et al. (1980)): (2.6) With D > 50 , and p > 10 , the 95% value of the F distribution is less than 2 (Bol- len and Jackman 1990).It is easy to see that equation 2.7 gives 1 when Then the cut-off value for the leverage of fixed effects seems fairly defined by twice the average RT tr(L f ,d ) = p D , as happens for the linear regression model.
Those domains with leverages greater than the cut-off points reveal a potentially significant influence of the direct estimator on the value of the EBLUP vector ̂ d .In the linear regression with a sample of n observations, i = 1, ..., n , there are some accepted critical values for the Cook's distance Δ i , when the observation i is deleted.One of this is Δ i > 1 , that corresponds to "distances"between ̂ and ̂ (i) that lie beyond a 50% confidence region, considered as too large.The most used cutoff values are Δ i > 4 n , or Δ i > 4 n−p .These values are derived starting from the alternative formula

Domain deletion
with L i the diagonal element of the leverage matrix, and t i the i-th studentized residual.Substituting the L i with the mean leverage L i = p n , the last critical value gives , when the regression studentized residuals exceed the usual critical value t i = 2 .Using the formulation of Banerjee and Frees (1997) of the confi- dence ellipsoid at ( 1 − )100% for the ̂ 's, i.e. ( − ̂ ) � (X � V −1 X) −1 ( − ̂ ) ≤ pF 1− , for a linear mixed effects model we get the following critical region for Δ d : (2.7) (2.9) Three-fold Fay-Herriot model for small area estimation and… This critical value corresponds to the similar formula of the threshold defined in Cook (1977) for the linear regression model.Christensen et al. (1992)

Residuals
We use the approach of Calvin and Sedransk (1991) to analyze the model residuals.The three-fold Fay-Herriot model is a block-diagonal LMM with correlated observations, by their nested hierarchical structure.The method employs the analysis of the model-scaled residuals, after using the Cholesky root of the covariance matrix of the vector of target variables.Then, the new model has a uniform dispersion matrix.Given the block-diagonal covariance matrix V = V( ̂ ) and the correspondent Cholesky root V = Ĉ� Ĉ , the transformed model is . The residuals of model (2.12) are and the externally studentized residuals rm ch,drt are (2.10) (2.12) where v drt is the diagonal element of the matrix σ2 is the estimated variance of the regression model (2.12), treated as a fixed effect linear model, with the data (y drt , x drt ) omitted.Furthermore, the asymptotic distribution of rm ch,drt is Student's t with DRT − p − 1 degrees of freedom.

MSEs and correlations
The efficiency of the EBLUP in the small area estimation is usually evaluated by comparing the average MSE's of the different models under investigation, given some application data.When theoretical studies or simulations are available, it is possible to compare the effectiveness of a model with respect to the population parameters given by a data generating process.Nevertheless, due to the circumstance that employing the BLUP approach is justified by the presence of large sampling variances in the small areas, one may study the behavior of the EBLUP in terms of the reduction of the estimated variability of the direct estimator, towards the prediction of the MSE of the EBLUP.Then, a normalized measure of the overall reduction of the sampling variances by the MSE's may be quite useful, when assessing the efficiency of the linear predictor of a small area estimation model.Let us consider the conditional residuals r c = y − ̃ , so that Appendix A calculates the MSE of the BLUP ̃ and obtains the formula (A.1), i.e.MSE( ̃ ) = V e − V e PV e .Therefore, it holds that For ease of exposition, we write . By using the alternative formula (2.13) Three-fold Fay-Herriot model for small area estimation and… Moreover, r c = V e Py ∼ N(0, V e PV e ) , and

If corr(r c
i , e i ) approaches 1, this suggests that the correspondent conditional residual r c i can be considered a worthy replacement of the sampling error e i .Given 0 ≤ The residuals r c i can be written as Zewotir and Galpin (2007) in the framework of mixed models, here we observe that if This means that small areas with small c * ij give small residuals, and then they represent high leverage values.When c * ii ⟶ 0 the correspondent small areas make little contributions to , and the predicted MSE tends to equal the estimated sampling variance.Conversely, with c * ii ⟶ −2 i , we have corr(r c i , e i ) ⟶ 1 , and the conditional residuals as This condition gives the major possiblecontribution of the i-th small area to the value of the index , and the best performance of the model in terms of the reduction of the expected MSE of the EBLUP.
A measure of the model efficiency, say , is then based on tr corr (r c , e) , i.e.
where the last sum in i can be calculated by adding the diagonal elements of the square matrix Taking the MSE estimator mse( μi ) = mse( μi ) + 2g 3,i ( ̂ ) from Sect.2.4, we define the mse-ratio (2.15) A dispersion graph plotting ( i , i ) indicates jointly the efficiency of the model and the contribution of the variability of the variance component estimators to the estimation of the MSEs of the EBLUPs μi , i = 1, … , DRT.

Estimation of mean squared errors
This Section considers the problem of estimating the MSE of the EBLUP of We give an analytic and a bootstrap MSE estimator.

Analytic estimator
For approximating the MSE of μdrt , we follow Section 17.2.3 of Morales et al. (2021).
We obtain the formulas where where ̂ is the REML estimator.Appendix A gives the calculations of g 1drt , g 2drt and g 3drt .Alternatively, we introduce a parametric bootstrap procedure for estimating the MSE( μdrt ).

Parametric bootstrap estimator
The parametric bootstrap procedure for estimating the MSE of μdrt has the following steps.

Output: for
Three-fold Fay-Herriot model for small area estimation and… 3. Calculate also the corresponding relative performance measures in %.The relative bias (RBIAS), the relative root-MSE (RRMSE), the average absolute relative bias (ARBIAS), and the average relative root-MSE (ARRMSE) are Table 1 presents the results of Simulation 1 for R = T = 10 .It shows that the bias is always close to zero and that the MSE decreases when the number of domains D increases, so the REML estimators seem to be consistent.From the last line, one can observe that the MSE of the EBLUP is almost constant with increasing D. This behavior is expected since as the number of small areas increases, the number of terms that we have to predict also increases.So that the ratio of what we have to predict to sample size is constant.

Simulation 1a
Simulation 1a compares the BLUPs and EBLUPs based on of the one-fold, twofold and three-fold Fay-Herriot models.We generate the data in the same way as in Simulation 1.This is to say, the data generating model is the FH3 model The one-fold Fay-Herriot model (FH1) is If we define the new index i = (d, r, t) , then we can write The two-fold Fay-Herriot model ( FH2) is If we define the new indexes i = (d, r) and j = t , then we can write We take 1 and 2 1 = 0.1, 0.5, 1, 1.5, 2 .Tables 2 presents the simulation results of absolute (top left)  3 presents the simulation results of absolute (top left) and relative (top right) Bias and root-MSE (bottom left) and relative root-MSE (bottom right) for EBLUPs.Compared to the BLUPs the results for the EBLUPs are mitigated although the same trend remains for FH1, FH2 and FH3.

Simulation 2
Simulation 2 studies the behavior of the estimators of the MSE of the EBLUP, introduced in Sect.2.4.Both the analytic and the parametric bootstrap estimators are investigated.More concretely, we investigate the behavior of the estimators mse( μdrt ) and mse * ( μdrt ) .To do this, such estimators are compared with the empirical MSE of μdrt obtained at the output of Simulation 1.The procedure is as follows.

Output: for
The bias (BIAS), the root mean squared error (RE), the average absolute bias (AAB), the average root mean squared error (ARE), the relative bias (RB), the relative root mean squared error (RRE), the average absolute relative bias (AARB) and the average relative root mean squared error (ARRE) are 4. Output: , a = 0, 1,  Three-fold Fay-Herriot model for small area estimation and… Figure 1 plots the boxplots with the DRT values of RB 1 drt (left) and RRE 1 drt (right), with D = 50 and R = T = 10 for B = 100, 200, 300, 400 .We observe that the para- metric bootstrap method presents a small relative bias around zero and a relative root MSE which mean decreases from 0.73% to 0.54% as B increases.We recommend applying this method with at least B = 300 replications, even if the calculation times can be heavy.
Table 4 presents the results of the performance measures of Simulation 2 for R = T = 10 , B = 300 and several values of D. It shows that AAB a and ARE a , a = 0, 1 , tend to zero as D grows.The analytical estimator mse 0 drt has a very good behavior.It is basically unbiased and its average quadratic error is smaller than that of the basic bootstrap estimator mse 1 drt .

Model and poverty predictions
This Section presents an application to real data with the main goal to estimate the poverty incidence (proportion of individuals under poverty line) in Spanish domains under an area-level three-fold Fay-Herriot model.We use data from the SLCS corresponding to the years 2004-2008, with D = 52 domains (provinces), R = 2 sub- domains (sexes) and T = 5 time periods (years).The target variable y drt is de Hájeck direct estimator of the poverty proportion and the error variance 2 drt is the estimate of the design-based variance of the direct estimator, as described in Section 2.5 of Morales et al. (2021).
As usually done in SAE, the auxiliary variables are related to demographic characteristics, socioeconomic status, education, and immigration status which are relevant for government policymaking and program planning, also at smaller geographical levels.It is essential to take into account the interpretation of the relationship drt , a = 0, 1. between auxiliary and dependent variables, and the fact that it could vary with the level of aggregation.We take the auxiliary variables from the Spanish Labour Force Survey (SLFS).The SLFS is published quarterly, includes nearly 65,000 dwellings, equivalent to approximately 160,000 people.Each quarterly SLFS sample size is larger than the size of an annual SLCS sample.Further, to increase the precision of the estimates, we jointly use the data from the four trimesters of years 2004-2008 to calculate the auxiliary variables by using the formula of the Hájek direct estimator of a domain mean.The constructed auxiliary variables are the domain proportions of people in the categories of the following factors: • citizenship: with 2 categories denoted by cit1 for Spanish and cit2 for Not Spanish.
• labor: labor situation with 4 categories taking the values lab0 for Below 16 years, lab1 for Employed, lab2 for Unemployed and lab3 for Inactive.Three-fold Fay-Herriot model for small area estimation and… In the strict sense, the explanatory variables have measurement errors independent of sampling errors.It would therefore be necessary to generalize to the FH3 model, the approaches of Ybarra and Lohr ( 2008), Burgard et al. (2020Burgard et al. ( , 2022)).This task exceeds the objectives of the current research.In this sense, this section presents an illustrative application of the statistical methodology introduced in Sect. 2. Based on the preliminary exploratory analysis, the only significant covariates of the selected model were the ones included in Table 5.The model has two statistically significant variables that have a relevant meaning in the socio-economic sense.These variables are cit2 (not Spanish citizens) and lab2 (unemployed).Table 5 pre the signs of the regression parameters we interpret that there is an inverse relationship between poverty proportion and the category cit2 of the explanatory variable.That is, poverty incidence tends to be smaller in those domains with a larger proportion of population in the subset defined by citizenship not Spanish.On the other hand, poverty incidence tends to be larger in those domains with a larger proportion of population in the subset defined by lab2 , i.e. in the category of unemployed peo- ple.All the p-values are lower than 0.05 for all the considered auxiliary variables.Table 5 also gives the asymptotic confidence intervals (CIs) for the regression parameters at the 95% confidence level.The rows with labels INF and SUP contain the lower and upper limits respectively.By observing these confidence intervals, we conclude that all the regression parameters are significantly different from zero.
Table 6 presents the CIs for the variance parameters at the 95% confidence level.The columns with labels INF and SUP contain the lower and upper limits respectively.The column with label 0 ∈ CI contains T (true) if 0 belongs to the CI and F (false) otherwise.We observe that the CIs for all the variances do not contain the origin 0 in any case, so the variances are significantly positive.Table 6 also gives the CIs for the difference of variances 2 1 − 2 2 , 2 1 − 2 3 and 2 2 − 2 3 .The three variances can be considered two by two as different at the 95% confidence level is significantly greater than zero.Therefore, we can recommend this model for deriving predictors of poverty indicators.
Figure 2 plots the direct and EBLUP poverty proportion estimates for all domains (province × sex × year).Concerning root-MSEs, the figure shows that  Three-fold Fay-Herriot model for small area estimation and… the EBLUPs have lower MSEs than the direct estimators.Therefore it is worthwhile using model-based predictors instead of direct estimators.Figure 3 plots the direct and EBLUP poverty proportion estimates for men (left) and women (right) in the year 2008.This figure shows that they follow a similar (parallel) pattern.
Figure 4 plots the RMSEs of the direct estimators and the EBLUPs of poverty proportions for men (left) and women (right) in the year 2008.This figure shows that the model-based predictors have lower root-MSEs than the direct estimators.
Figure 5 plots the Spanish provinces in 4 colored categories depending on the values of the EBLUPs (in % ) of the poverty proportions.We observe that the Span- ish regions where the proportion of the population under the poverty line is smallest are those situated in the north and east, like Cataluña, Navarra and Basque country.On the other hand, the Spanish regions with higher poverty proportions are those situated in the center-south, like Andalucía and Extremadura.In an intermediate position, we can find regions that are in the center-north of Spain, like Castilla León, Cantabria or Asturias.
We also fitted the SLCS data to a simpler two-fold Fay-Herriot model as a competitor.On the basis of the likelihood-ratio test (LRT), at 95% we rejected the null hypothesis that the simpler model is consistent with the data, and therefore we selected the three-fold Fay-Herriot model.The Akaike information criterion (AIC) is 1329.43 for the FH2 model versus 1331.673 for the FH3 model.We additionally calculated the Schwarz's Bayesian information criterion (SBC) which result 1318.928 for the FH2 model versus 1315.92 for the FH3 model.From the results of the test LRT and the SBC measure we can consider the more complex three-fold Fay-Herriot model as the best model for the fitting.In order to check if used data from the SLCS can be modelled with a time correlated effect, we fitted them on a three-fold Fay-Herriot model with AR(1) time effects.Details on model definition, REML algorithm for variance components estimation and confidence intervals are given in Appendix B. Table 7 presents the regression parameters and their corresponding p-values.It shows that estimated regression coefficients are quite similar to the ones obtained fitting the model with independent time effects as shown in Table 5.Table 8 report the 95% confidence intervals for the variance parameters of the random effects.The results show that fitting the three-fold Fay-Herriot model with correlated time effects leads to a lost in the significance for the variance of the second random effect u 2 .Moreover, even if it is significant, the correlation param- eter is close to zero. Figure 6 shows the comparison between the raw residuals (y drt − μdrt ) of model with independent time effects and the raw residuals of model with AR(1) correlated time effects.The evidence from this figure is that there is no difference in the distribution of the residuals from the two models.So, for the parsimony principle we prefer the simpler model with independent time effects over the more complex model with time correlation parameter .Finally, for being more confident about the selected model with independent time effects as a true generating model, we will give suitable diagnostics.Three-fold Fay-Herriot model for small area estimation and…

Model diagnostics
The employment of small area estimation models suggests investigating how the model diagnostics can be addressed to analyze some special questions that may be of interest to survey practitioners, that are typical of this field of investigation.For instance, traditional mixed model diagnostics methods can be supplied by the measures of the efficiency of the small area estimation model applied, or indexes that highlight the influence of data on the estimation of the mean squared error of the Eblup.The evaluation of the data structure, concerning their impact on the estimation of fixed and random-area effects, as well as on the covariance parameters estimates, are some of these diagnostic measures.For example, the evaluation of the influence of the data on the covariance parameters estimates may be relevant, in terms of the evaluation of the weight of the regression-synthetic estimation part of the model.In the sequel, we are interested in purely exploring the impact of observations on the fitting and the evaluation of the model, rather than remove permanently particular domains from the estimation process.Figure 7 (left) plots the raw residuals of the fitted three-fold Fay-Herriot model against the direct estimates y drt .Figure 7 (right) shows the raw residuals sorted (in ascending order) by the variances of the direct estimator.It shows that the greater the variances of the direct estimators are, the higher the residuals of the model are.
Figure 8 plots the raw residuals for men (left) and women (right) at the year 2008.This figure shows that the model smooths the predictions.Positive men residuals indicates that the EBLUPs tend to be lower than the corresponding direct estimates.Analogously, negative women residuals indicates that the EBLUPs tend to be greater than the corresponding direct estimates.
Figure 9 (left) plots the fixed-effects and the random-effects average leverages points L f ,d and L r,d , defined in (2.6), of the domains (provinces), with the cor- responding thresholds.Table 9 of Appendix C gives the acronyms of the Spanish provinces.After averaging by sub-domains, Almería (AL) is the most influential on fixed-effects because exceeds the second cut-off value 2 × rank (X)∕DRT (see subsection 2.3.1).Barcelona (B) is on the border of the second-level threshold as leverage point for the random-effects.Nobre and Singer (2007) suggest evaluating the product , in order to measure leverages of the random effects.This means that, in the present application, a small average contribution in a specific domain, given the sampling variance of the direct estimator, can lead to relevant random-effects leverage values.Barcelona reports simultaneously low-level of sampling variances and moderate fixedeffects leverages, which is enough to justify the result reported.
Figure 9 (right) shows the joint variation by the covariates in the model, by plotting ̂ (d) − ̂ , defined in (2.8).The "non Spanish citizenship"slope parame- ter estimate is close to −0.27 , and the variation due to omitting Provinces has a Three-fold Fay-Herriot model for small area estimation and… moderate consequence on the fixed-effect estimation.The most explicit change is due to Burgos (BU) and Castellón (CS), which, when it is omitted, causes the major possible increase of the estimates of the fixed-effect parameter.The provinces of Madrid (M), Santa Cruz de Tenerife (TF) and Almería (AL), when omitted, cause the most important decrease of the corresponding fixed-effect estimate.This corresponds to improve the negative average impact on the model dependent variable, namely the poverty proportion direct estimate.
The variable "unemployed" (with an estimated slope parameter close to 0.44) has generally more impact on the estimates.Cáceres (CC) and Cádiz (CA) are the provinces that decrease the parameter estimate.This deals with the lowest possible value of the fixed-effect when a domain is deleted.When these provinces are omitted, the estimate of the coefficient estimate of the "Unemployed" variable undergoes the greatest decrease.Santa Cruz de Tenerife causes the major increase in the estimated parameter.When this province is deleted from the data, the impact of the "Unemployed" covariate on the dependent variable is very high.Figure 10 shows the studentized residuals (2.13) with the confidence limits at ±2 (left) and a Normal Q-Q (quantile-quantile) plot of the studentized residuals (2.13) (right).Few labels are reported for some points to identify as outliers, which are coded as follows: the province abbreviation, letter 'M' for male or 'F' for female, the year from 2004 to 2008 for simplicity denoted from 1 to 5, separated with dots.From a graphical analysis of the figure we can conclude that the assumption of normality is reliable.The studentized residual quantiles present a linear trend in relation to the quantiles of the N(0, 1) distribution.Furthermore, with a confidence level of 95% all points belong to the confidence region, with only two points falling outside, which are referred to Tenerife (TF) province.
Figure 11 (left) plots jointly the fixed-effects leverages of the model (2.12) and the residuals rm ch,drt , defined in (2.13), together with their cut-off thresholds.Again some labels are reported, which are coded as before: the province abbreviation, letter 'M' for male or 'F' for female, the year from 2004 to 2008 for simplicity denoted from 1 to 5, separated with dots.Leverage extreme points confirm that Almería (AL) as one of the most influential province, but the scaled residuals reveal that Málaga (MA), Coruña (C), and Valladolid (VA) (in some years), are quite very influential on the fixed effects estimates of the "reduced" model.In general, the residuals rm ch,drt are averaged by the Cholesky decomposition, and then they can be masked by the correspondent smoothing effect (Calvin and Sedransk, 1991).This fact may not highlight some observations in the residual analysis.In the present case, we have evidence of provinces of Málaga, Coruña, and Valladolid in the residual plot of the rm ch,drt 's.This means that in the presence of two different smoothing given by averaging the fixed-effects leverages by the sub-domains and by the model variance decomposition, the first is the prevailing one.
Figure 11 (right) plots the Cook's distances defined in (2.9), and highlights that the most influential provinces due to extreme residual points are Santa Cruz de Tenerife (TF), Castellón (CS), and Cáceres (CC).Due to very high fixed-effects leverage value, the most influential is Almería (AL).
Figure 12 (left) shows values of ( ̂ (d) ) and the mse( ̂ (d) ) , introduced in (2.10) and (2.11) respectively, which points out that Barcelona (B), when pulled out of the data, causes the worst average loss in terms of efficiency of the model.Further, Santa Three-fold Fay-Herriot model for small area estimation and… Cruz de Tenerife (TF) when deleted produces the best gain in model efficiency.This fact causes simultaneously the maximum departure from the set of the covariance parameters estimates.
For the cut-off value = 0 , Fig. 12 (right) reports the values of the ratio where mse( ̂ ) was given by (2.11).Plotting ( ̂ (d) ) defined in (2.10) and ( ̂ (d) ) indi- cates if varying (by deletion) the estimates ̂ , the model increases (decreases) the efficiency of the estimation.There are some important cases that reports highly negative ( ̂ (d) )'s, and this demonstrates again that Barcelona has the major negative rel- ative impact on the average of the domain estimates of the prediction mean squared error, when deleted.The relative impact is positive when Soria (SO) is omitted.Figure 13 plots the efficiencies i versus the mse-ratios i , defined in (2.14) and (2.15) respectively.This figure shows to which provinces the model attributes more efficiency i in the estimation.In our case study, we got = 0.741251 , the vertical bar in the figure.In particular, provinces with the lowest values of i (here it is convenient to set drt = i ) are those with smallest sampling variances (i.e., 2 i , see the subsection 2.3.4), and then this situation gives more weight to the first component (i.e., g 1,i ( ̂ ) ) of the estimated MSE(̃ ) , that is related to the variance of the direct estimator.The cor- respondent low values of corr(r c i , e i ) = i are justified by the fact that MSE(̃ ) = g 1,i ( ̂ ) + g 2,i ( ̂ ) , almost approaching the sampling variance 2 i .Conversely, the highest values of i are provided by the provinces with c i = 4 i c * ii ≃ 2 i , that implies c * ii ≃ −2 i (see again the subsection 2.3.4).Indeed, we observe that when c * ii ⟶ −2 i we get c * ij ⟶ 0 ( i ≠ j ), and, consequently, i ∕y i ⟶ 1 , and corr(r c i , e i ) ⟶ 1 .Ávila (AV), Cuenca (CU), Palencia (P), Segovia (SG), Soria (SO), and Zamora (ZA) are the most efficient in terms of i and a low value of the third component 2g 3 , i.e. in terms of the greatest rela- tive difference between the sampling variance and the MSE, respect to the sampling variance itself (see the formula of i in 2.14).Moreover, provinces of Guipuzcoa (SS) and Cantabria (S) have some of the most relevant values in terms of the third component 2g 3 of the MSE of the EBLUP (see in 2.15) Guipúzcoa (SS) and Gerona (GI) have some of the relevant values in terms of the third component of the MSE of the EBLUP.

Conclusions
This paper introduces the new three-fold Fay-Herriot model, which accounts for three levels of hierarchy to jointly model data from domains, subdomains, and time periods or subsubdomains.For calculating the REML estimators of the model parameters, a Fisher-scoring algorithm is implemented.Under the new model, the EBLUPs of linear indicators are derived and the corresponding MSEs are approximated.Two MSE estimators are proposed.The first one is analytic and based on the , MSE approximation.The second one relies on parametric bootstrap.The simulation experiments empirically investigate the REML estimators, the EBLUPs and the MSE estimators.A three-fold Fay-Herriot model with AR(1) time effects is also introduced in Appendix B, but it is not finally selected in the application to real data.Several measures of diagnostics are also proposed, not only to assess the general efficiency of the model but also to get valuable information to interpret underlying data relationships.Domain deletion diagnostics and Cook's distance -both based on marginal residuals and leverages-allow to evaluate the influence of each domain on the estimates.The former through the shift in the estimators of regression parameters, the latter highlighting which area is the most influential in the analysis.Moreover, regarding the MSE of the EBLUPs, the average MSE estimators are computed iteratively removing one domain at a time, and then are combined in a ratio to assess the impact of each domain in the performance of the MSE estimates.The analysis of model residuals is achieved by employing the scaled residuals obtained through a Cholesky root of the covariance matrix of the original data, to appraise the connection of the selected covariates with the response variable.An efficiency measure of the model is instead proposed on the basis of the difference between the sampling variances and the g 1 and g 2 com- ponents of the mse and, finally, a mse-ratio is accomplished to fix the variability of the estimation of the variance components to the predicted MSE.The reported diagnostics are then combined together in nice graphical representations which give great support, especially in the interpretation of the results within the application to real data.
It is worth reflecting on some conclusions about the model diagnostics presented, and in particular, on the measures introduced.The model studied is an extension of the Fay-Herriot model, where are present areas in a three-fold matrix of random effects design.The model belongs to the class of LMMs with block-diagonal structure.We have two main features of this model, with relevant consequences in terms of evaluation of the results.The first is that the model has more than one observation unit per area, unlike the basic Fay-Herriot model.This is because of the presence of the variable of the gender, and also to repeated observations of the same area with respect to five-time instants.The second is that since we apply an area-level model to the data, the variance of the residual mixed model errors is given.Both these model characteristics are considered in the diagnostics and influence analysis reported.In particular, we analyze the model marginal residuals through their reduction to ordinary regression residuals, due to correlated observations.This procedure depicts the areas starting only from the fixedeffects matrix, and can play a significant role in the assessment of the connection of the selected covariates with the response variable.This allows also to study the model (2.12) as a theme-based model, through its scaled residuals r m ch , in addition to the considerations related to the model-based estimation.Anyway, in an effort to eliminate the instability induced by the sampling variances in the model estimation process, the linear relation at the basis of the study can be further assessed by the diagnostic properties of the model reduced by the Cholesky root.In particular, using the Cholesky decomposition for the residuals, make some information about the data (i.e.some influential provinces) visible that would otherwise remain masked.
A further and prominent aspect of small area estimation models is related to the best reduction of the MSE of the area level predictors.Following the usual linear mixedmodel Covratio diagnostics, we introduced the use of some new plots, in order to rate 1 3 Three-fold Fay-Herriot model for small area estimation and… possible favorable changes in the estimation of the model.This also with a joint evaluation of the level of the departure of the covariance parameters from the complete model and the change in the MSE.Finally, because in the area-level models the residual error variance is not to be estimated, model conditional residuals can be used to illustrate the level of efficiency of the model.The model efficiency index is then given by the joint consideration of conditional residuals and model errors, obtained by averaging their correlations.
The new small area model is applied to the Spanish living condition survey data of the periods 2004-2008.The target is to estimate poverty incidence by province, by sex and by year.Since the data were available for 5 periods, complex temporal correlation structures are not considered.The statistical inferences on the presented model show good results, both in the significance level of the estimated parameters and in the minimization of the MSEs of the EBLUP estimates.Then with a map is depicted the distribution of the EBLUP estimates for the proportion of poverty between the Spanish provinces.The model introduced retains its features in terms of assessment of the impact of the demographic and the economic predictors used on the poverty proportion.Being the observations of the Spanish Provinces, evaluated in domains of the sexes and repeated by time instants, the model offers the opportunity to read analytically the considerable differences that distinguish various territories of Spain.Finally, we remark that the simulations and the application of the model to the real data have been carried out with the programming language R. The developed software is available on request.

A.1 Basic calculations
The mean squared error matrix of ̃ , with diagonal elements MSE( and: Since ZV u Z � = V − V e , U = XM(V − V e ) , we observe that VM � X � = XMV , and V e M � X � = V e V −1 XMV .Therefore, we have where

If we write
Three-fold Fay-Herriot model for small area estimation and… Note that a = col It holds that A.4 Calculation of g 3drt () It holds that where The derivatives of b ′ are 1 3 Three-fold Fay-Herriot model for small area estimation and… where The components of (∇b � )V(∇b � ) � take the form 1 3 Three-fold Fay-Herriot model for small area estimation and… Finally, we have where F ab is the generic element of the REML Fisher information matrix that was calculated for deriving the updating formula of the Fisher-scoring algorithm.
B Appendix: Three-fold Fay-Herriot model with AR(1) time effects

B.1 The model
Let us consider the model where y drt is a direct estimator of the characteristic of interest and x drt is a row vec- tor containing the aggregated population values of p auxiliary variables.The subscripts d, r and t represent domains, subdomains and subsubdomains respectively.For example, d, r and t may represent area, category (for example, sex-age group) and time period respectively.We assume that u where 1 T denotes the T × 1 vector of ones and I T is the T × T identity matrix.We can write model (B.1) in the subdomain form q 11 q 12 q 13 q 21 q 22 q 23 q 31 q 32 q 33 where ̂ and û are given in (B.5).Similarly as before, we define the BLUP and EBLUP vectors Let P X = X(X � V −1 X) −1 X � V −1 be the projection matrix in the p-dimensional subspace given by the column vectors of X in the metric of V −1 , and define P = V −1 (I − P X ) = V −1 − V −1 X(X � V −1 X) −1 X � V −1 as the projection matrix in the residuals complement subspace of y .Then, we may write the model BLUP of as
( Ωdr ()), 1 3 Three-fold Fay-Herriot model for small area estimation and… Further, we have By taking derivatives of l reml with respect to a , we have the elements of the score vector S = S( ) = (S 1 , S 2 , S 3 , S 4 ) � , i.e.
By taking second derivatives with respect to a and b , taking expectations and changing the sign, we obtain the components of the Fisher information matrix F = F( ) = F ab a,b=1,2,3,4 , where Domain deletion diagnostics allow knowing relevant shifts in the estimators of regression parameters.Let ̂ (d) be the estimator of calculated without using the data from domain d and let r m d = y d − X d ̂ the vector of marginal residuals of domain d.It holds that The Cook's distance, studies the influence of the domain d, based jointly on marginal residuals and leverages, traditionally with a critical value 4/D.
where ̂ and û are given in (2.5).In matrix form, we have drt = a � (X + Zu) , where a = a drt = col th ))) is a vector hav- ing a one in the cell t + (r − 1)T + (d − 1)RT and having zeros in the remaining cells.The symbol d denotes the Kronecker's delta, i.e. d = 1 if d = and d = 0 other- wise.
the vector e * (b) = col 1≤d≤D drt )).(e) Calculate the bootstrap vector (f) Fit the assumed model to the bootstrap vector y * (b) , calculate the estimators ̂ * (b) , ̂ * (b) of the model parameters, the true value of the mixed parameter * SimulationsThree simulation experiments are carried out under the three-fold Fay-Herriot model.The data generation process is inspired by the simulation experiments presented in Chapter 17 ofMorales et al. (2021) for the FH2 model.The simulations investigate the behavior of the REML estimators of model parameters, the EBLUPs of domain means, and the bootstrap MSE estimators.The data generation process isFor d = 1, … , D , r = 1, … , R , t = 1, …,T , the explanatory and dependent vari- ables are

3. 1
Simulations Simulation 1 investigates the bias and the MSE of the REML estimators of the model parameters and of the EBLUPs.The simulation steps are 1.Repeat I = 10 3 times ( i = 1, … , I) 1.1.G e n e r a t e a s a m p l e o f s i z e D R T a n d c a l c u l a t e (i)

Fig. 2
Fig. 2 Estimates of poverty proportions (left) and root-MSEs (right) by province, sex, and year

Fig. 3 Fig. 4
Fig. 3 Direct and EBLUP poverty estimates for men (left) and women (right) in 2008

Fig. 5
Fig. 5 Map of poverty proportion for men (top) and women (bottom), year 2008

Fig. 6 Fig. 7
Fig. 6Comparing raw residuals of model with independent time effects and raw residuals of model with AR(1) correlated time effects

Fig. 13 i
Fig. 13 i vs i , with province abbreviations

(
rs tk )) .Then,we have A.3 Calculation of g 2drt() are mutually independent, with  2 drt > 0 known and Model (B.1) is a generalization of the two-fold Fay-Herriot model with correlated time effects described in Section 17.3 ofMorales et al. (2021).At the domain level d and subdomain level r, we define the matrices and vectors
. The corresponding EBLUP of drt is obtained by substituting by an estimator ̂ = ( σ2 and û are given in (2.5).The EBLUP of drt is the predictor of the pop- ulation mean Y drt , i.e.Ŷeblup drt= μdrt .The synthetic estimator, μsyn drt = x drt β , can be employed for out-of sample domains.The BLUP and EBLUP vectors are introduced the Covratio statistics where ̂ and ̂ (d) are the REML estimators of calculated with and without the data of domain d and F is the REML Fisher information matrix.A value of ( ̂ (d) ) close to zero indicates that the influence of domain d on the estimation of the variance parameters is negligible.We are thus interested in detecting domains d with large departures of ( ̂ (d) ) from zero.With regards to SAE, it is interesting to evaluate the effect of domain deletion in the MSE of the EBLUP.For this sake, we calculate the MSE estimators with and without domain d and we evaluate and compare the following two different averages of MSE estimators

Table 2
Bias (top)and RMSE (bottom) results of BLUPs

Table 3
Bias (top) and RMSE (bottom) results of EBLUPsThree-fold Fay-Herriot model for small area estimation and… and relative (top right) Bias and root-MSE (bottom left) and relative root-MSE (bottom right) for BLUPs.As 2 1 increases both FH1 and FH2 increase in Bias and MSE, instead FH3 model always remains stable.Table

Table 4
AAB a , AARB a , ARE a and ARRE a , a = 0, 1 , with R = T = 10 and B = 300

Table 9
List of Spanish Provinces with corresponding acronyms