Abstract
This article is divided into two parts. In the first part, we review and study the properties of single-stage cross-sectional and time series benchmarking procedures that have been proposed in the literature in the context of small area estimation. We compare cross-sectional and time series benchmarking empirically, using data generated from a time series model which complies with the familiar Fay–Herriot model at any given time point. In the second part, we review cross-sectional methods proposed for benchmarking hierarchical small areas and develop a new two-stage benchmarking procedure for hierarchical time series models. The latter procedure is applied to monthly unemployment estimates in Census Divisions and States of the USA.
References
Battese GE, Harter RM, Fuller WA (1988) An error components model for prediction of county crop area using survey and satellite data. J Am Stat Assoc 83:28–36
Bell WR, Datta GS, Ghosh M (2012) Benchmarking small area estimators. Biometrika 100:189–202
Butar F, Lahiri P (2003) On measures of uncertainty of empirical Bayes small area estimators. J Stat Plan Inference 112:63–76
Cholette P, Dagum EB (1994) Benchmarking time series with autocorrelated survey errors. Int Stat Rev 62:365–377
Dagum EB, Cholette P (2006) Benchmarking, temporal distribution and reconciliation methods for time series data. Springer, New York
Datta GS, Ghosh M, Steorts R, Maples J (2011) Bayesian benchmarking with applications to small area estimation. Test 20:574–588
Di Fonzo T, Marini M (2011) Simultaneous and two-step reconciliation of systems of time series: methodological and practical issues. J R Stat Soc Ser C 60:143–164
Doran HE (1992) Constraining Kalman filter and smoothing estimates to satisfy time-varying restrictions. Rev Econ Stat 74:568–572
Durbin J, Quenneville B (1997) Benchmarking by state space models. Int Stat Rev 65:23–48
Fay RE, Herriot RA (1979) Estimates of income for small places: an application of James–Stein procedures to census data. J Am Stat Assoc 74:269–277
Ghosh M, Steorts RC (2013) Two-stage Bayesian benchmarking as applied to small area estimation. Test 22:670–687
Hall P, Maiti T (2006) On parametric bootstrap methods for small area prediction. J R Stat Soc B 68:221–238
Harvey AC (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge
Hillmer SC, Trabelsi A (1987) Benchmarking of economic time series. J Am Stat Assoc 82:1064–1071
Isaki CT, Tsay JH, Fuller WA (2000) Estimation of census adjustment factors. Survey Methodol 26:31–42
Lahiri P (1990) “Adjusted” Bayes and empirical Bayes estimation in finite population sampling. Sankhya Ser B 52:50–66
Nandram B, Sayit H (2011) A Bayesian analysis of small area probabilities under a constraint. Survey Methodol 37:137–152
Pfeffermann D (2013) New important developments in small area estimation. Stat Sci 28:40–68
Pfeffermann D, Nathan G (1981) Regression analysis of data from a cluster sample. J Am Stat Assoc 76:681–689
Pfeffermann D, Burck L (1990) Robust small area estimation combining time series and cross-sectional data. Survey Methodol 16:217–237
Pfeffermann D, Barnard CH (1991) Some new estimators for small area means with application to the assessment of farmland values. J Business Econ Stat 9:73–83
Pfeffermann D, Tiller RL (2006) Small area estimation with state-space models subject to benchmark constraints. J Am Stat Assoc 101:1387–1397
Prasad NGN, Rao JNK (1990) The estimation of the mean squared error of small-area estimators. J Am Stat Assoc 85:163–171
Rao JNK (2003) Small area estimation. Wiley, New York
Steorts RC, Ghosh M (2013) On estimation of mean squared errors of benchmarked empirical Bayes estimators. Stat Sin (2014, in press)
Ugarte MD, Militino AF, Goicoa T (2009) Benchmarked estimates in small areas using linear mixed models with restrictions. Test 18:342–364
Wang J, Fuller WA, Qu Y (2008) Small area estimation under a restriction. Survey Methodol 34:29–36
You Y, Rao JNK (2002) A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Can J Stat 30:431–439
You Y, Rao JNK, Dick P (2004) Benchmarking hierarchical Bayes small area estimators in the Canadian census undercoverage estimation. Stat Transit 6:631–640
You Y, Rao JNK, Hidiroglou M (2013) On the performance of self benchmarked small area estimators under the Fay–Herriot area level model. Survey Methodol 39:217–229
Acknowledgments
We are very grateful to three anonymous reviewers for providing many excellent comments, which enhanced the quality of this article.
Author information
Authors and Affiliations
Corresponding author
Additional information
This invited paper is discussed in comments available at: doi:10.1007/s11749-014-0382-6; doi:10.1007/s11749-014-0384-4; doi:10.1007/s11749-014-0386-2; doi:10.1007/s11749-014-0400-8.
Appendices
Appendix A: Computation of \(\tilde{\Sigma }_{tt}^d =E(\tilde{{\mathbf {e}}}_t^d {\tilde{{\mathbf {e}}}}{'}_t^d )\)
The matrix \(\tilde{\Sigma }_{tt}^d =\left[ \begin{array}{cc} \Sigma _{tt}^d &amp; {\mathbf {h}}_{tt}^d \\ {\mathbf {h}}_{tt}^{d\prime } &amp; v_{tt}^d \end{array} \right] \) has as its main block the \(S\times S\) diagonal variance–covariance (V–C) matrix of the State sampling errors, \(\Sigma _{tt}^d =E({\mathbf {e}}_t^d {\mathbf {e}}_t^{d\prime })=\hbox {diag}[\sigma _{ds,tt}^2 ]\), where \({\mathbf {e}}_t^d =(e_{d1,t} ,\ldots ,e_{dS,t})^{\prime }\) (the direct CPS sampling errors are independent between the States). Computing the remaining elements of \(\tilde{\Sigma }_{tt}^d \), namely \({\mathbf {h}}_{tt}^d =E({\mathbf {e}}_t^d r_{dt}^\mathrm{bmk} )\) and \(v_{tt}^d =\hbox {Var}(r_{dt}^\mathrm{bmk} )\), requires revisiting the first-stage benchmarking.
Consider first the computation of \(v_{tt}^d \). Denote \(\varvec{l}_{dt}^\mathrm{bmk} =(\hat{\varvec{\alpha }}_{dt}^\mathrm{bmk} -\varvec{\alpha }_{dt} )\), such that \(r_{dt}^\mathrm{bmk} ={{\mathbf {z}}}^{\prime }_{dt} \varvec{l}_{dt}^\mathrm{bmk} \) and \(v_{tt}^d ={{\mathbf {z}}}^{\prime }_{dt} E(\varvec{l}_{dt}^\mathrm{bmk} \varvec{l}_{dt}^{\mathrm{bmk}^{\prime }}){\mathbf {z}}_{dt}\). Let \(\varvec{l}_t^\mathrm{bmk} =(\tilde{\varvec{\alpha }}_t^\mathrm{bmk} -\varvec{\alpha }_t )=(\varvec{l}_{1t}^{\mathrm{bmk}{^\prime }} ,\ldots ,\varvec{l}_{Dt}^{\mathrm{bmk}^{\prime }})^{\prime }\) and \(J_d \) be the indicator matrix of zeroes and ones satisfying, \(\varvec{l}_{dt}^\mathrm{bmk} =J_d \varvec{l}_t^\mathrm{bmk} \). It follows that \(v_{tt}^d ={{\mathbf {z}}}^{\prime }_{dt} J_d P_t^\mathrm{bmk} {J}^{\prime }_d {\mathbf {z}}_{dt} \), where \(P_t^\mathrm{bmk} =E(\varvec{l}_t^\mathrm{bmk} \varvec{l}_t^{\mathrm{bmk}^{\prime }})\) is obtained recursively as defined below (15).
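The selector-matrix computation of \(v_{tt}^d\) amounts to extracting the \((d,d)\) diagonal block of \(P_t^\mathrm{bmk}\) and forming a quadratic form in \({\mathbf {z}}_{dt}\). The following minimal numerical sketch uses made-up dimensions; `J_d`, `P` and `z` are hypothetical stand-ins for \(J_d\), \(P_t^\mathrm{bmk}\) and \({\mathbf {z}}_{dt}\), not quantities from the actual CPS application:

```python
import numpy as np

rng = np.random.default_rng(0)

D, q = 3, 2            # hypothetical: D divisions, state vectors of length q
d = 1                  # division of interest (0-based index)

# J_d: indicator matrix of zeros and ones picking out division d's block,
# so that l_d = J_d @ l for the stacked benchmark-error vector l of length D*q
J_d = np.zeros((q, D * q))
J_d[:, d * q:(d + 1) * q] = np.eye(q)

# P: a symmetric positive semi-definite stand-in for P_t^bmk = E(l l')
A = rng.standard_normal((D * q, D * q))
P = A @ A.T

z = rng.standard_normal(q)   # stand-in for z_dt

# v_tt^d = z' J_d P J_d' z
v = z @ J_d @ P @ J_d.T @ z

# equivalently, extract the (d,d) diagonal block of P directly
v_direct = z @ P[d * q:(d + 1) * q, d * q:(d + 1) * q] @ z
assert np.isclose(v, v_direct)
```

The explicit block extraction is usually preferable in practice, since forming \(J_d\) only to multiply it out wastes memory for large \(D\).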
Next consider the computation of \({\mathbf {h}}_{tt}^d =E({\mathbf {e}}_t^d r_{dt}^\mathrm{bmk} )\). By Eq. (D.2) in P–T, the error \((\tilde{{\mathbf {\alpha }}}_t^\mathrm{bmk} -{\mathbf {\alpha }}_t )\) when predicting the division state vectors can be written as
with \(G_t \) and \(K_t \) defined in P–T and \(\tilde{{\mathbf {e}}}_t =(e_{1t} ,\ldots ,e_{Dt} ,\sum \nolimits _{d=1}^D {b_{dt} e_{dt} } )^{\prime }\). Let \(U_t =G_t \tilde{T}\), so that (35) can be written as \(\varvec{l}_t^\mathrm{bmk} =U_t \varvec{l}_{t-1}^\mathrm{bmk} -G_t \varvec{\eta }_t +K_t \tilde{\varvec{e}}_t \). Repeated substitutions in the last equation yield,
where \(A_{t,j} =U_t U_{t-1} \cdots U_j \), \(j=2,\ldots ,t\); \(B_{t,j} =A_{t,j+1} K_j \); \(D_{t,j} =A_{t,j+1} G_j \). Suppose that the GLS filter is initialized at time \(t=1\) by \(\tilde{T}\tilde{\varvec{\alpha }}_0^\mathrm{bmk}\), independently of the rest of the series. Then \(C_{1,0}^\mathrm{bmk} =E[(\tilde{T}\tilde{\varvec{\alpha }}_0^\mathrm{bmk} -\varvec{\alpha }_1 )\tilde{{\mathbf {e}}}_1^{\prime } ]=0\) and, by Eq. (D.1) in P–T, \(\varvec{l}_1^\mathrm{bmk} =[I-P_{1\vert 0}^\mathrm{bmk} \tilde{Z}_1^{\prime } (R_{1,0}^\mathrm{bmk})^{-1} \tilde{Z}_1 ](\tilde{T}\tilde{\varvec{\alpha }}_0^\mathrm{bmk} -\varvec{\alpha }_1 )+P_{1\vert 0}^\mathrm{bmk} \tilde{Z}_1^{\prime } (R_{1,0}^\mathrm{bmk})^{-1} \tilde{{\mathbf {e}}}_1,\) where \(R_{1,0}^\mathrm{bmk} =[\tilde{Z}_1 P_{1\vert 0}^\mathrm{bmk} \tilde{Z}_1^{\prime } +\tilde{\Sigma }_{11,0} ]\). Denoting \(K_{1\vert 0} =P_{1\vert 0}^\mathrm{bmk} \tilde{Z}_1^{\prime } (R_{1,0}^\mathrm{bmk})^{-1},\,\,B_{t,1} =A_{t,2} K_{1\vert 0} ,\,\,D_{t,1} =A_{t,2} (I-K_{1\vert 0} \tilde{Z}_1 ),\,\,B_{t,t} =K_t ,\,\,D_{t,t} =G_t \), and ignoring the term \(D_{t,1} \tilde{T}(\tilde{\varvec{\alpha }}_0^\mathrm{bmk} -\varvec{\alpha }_0 )\), which is independent of the rest of the series and does not enter any of the computations that follow, Eq. (36) can be rewritten as
The relationship in (37) allows us to express the division prediction error \(r_{dt}^\mathrm{bmk} ={{\mathbf {z}}}^{\prime }_{dt} J_d \varvec{l}_t^\mathrm{bmk} \) as a difference between linear functions of the division sampling errors and the state vector errors, which are independent of the sampling errors. We thus have,
Now, \({\mathbf {e}}_t^d =(e_{d1,t},\ldots ,e_{dS,t})^{\prime }\), \({\tilde{{\mathbf {e}}}}{'}_j =(e_{1j} ,\ldots ,e_{Dj} ,\sum \nolimits _{k=1}^D {b_{kj} e_{kj} } )\) and \(e_{dj} =\sum \nolimits _{s=1}^S {b_{ds,j} e_{ds,j} } \), implying
where \({\mathbf {0}}_{(k)} \) is the null vector of length S in position (column) \(k\), and
Substituting (40) in (39) and then in (38) gives the expression for the vector \({\mathbf {h}}_{tt}^d \),
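The backward matrix products \(A_{t,j} =U_t U_{t-1}\cdots U_j\) underlying the coefficients \(B_{t,j}\) and \(D_{t,j}\) are conveniently computed by the recursion \(A_{t,t+1}=I\), \(A_{t,j}=A_{t,j+1}U_j\). A short numerical sketch with random stand-in matrices \(U_j\) (purely illustrative; these are not the model's actual \(G_t\tilde{T}\)):

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 3, 5                       # hypothetical state dimension and horizon
U = {j: rng.standard_normal((n, n)) for j in range(1, t + 1)}

# A[(t, j)] = U_t U_{t-1} ... U_j, built by the backward recursion
# A[(t, t+1)] = I,  A[(t, j)] = A[(t, j+1)] @ U_j
A = {(t, t + 1): np.eye(n)}
for j in range(t, 1, -1):
    A[(t, j)] = A[(t, j + 1)] @ U[j]

# check against the explicit product: A[(5, 2)] = U_5 U_4 U_3 U_2
explicit = U[5] @ U[4] @ U[3] @ U[2]
assert np.allclose(A[(t, 2)], explicit)
```

The recursion computes all the products \(A_{t,j}\), \(j=t,\ldots ,2\), in \(O(t)\) matrix multiplications, instead of \(O(t^2)\) when each product is formed from scratch.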
Appendix B: Computation of \(C_{dt}^\mathrm{bmk} =E[(\tilde{T}^d\tilde{\varvec{\alpha }}_{t-1}^{d,\mathrm{bmk}} -\varvec{\alpha }_t^d ){\tilde{{\mathbf {e}}}}{'}_t^d ]\)
The computation of the covariance matrix \(C_{dt}^\mathrm{bmk} \) is more involved since it requires computing the covariances between the division benchmark error and the State sampling errors. We first express the prediction error \((\tilde{T}^d\tilde{\varvec{\alpha }}_t^{d,\mathrm{bmk}} -\varvec{\alpha }_{t+1}^d)\) corresponding to time \(t+1\) as a function of the sampling errors and the state vector errors, similarly to (37). Under the model, \(\varvec{\alpha }_{t+1}^d =\tilde{T}^d\varvec{\alpha }_t^d +\varvec{\eta }_{t+1}^d \). Hence,
where \(\varvec{l}_t^{d,\mathrm{bmk}} =(\varvec{l}_{t1}^{{d,\mathrm{bmk}}{^\prime }} ,\varvec{l}_{t2}^{{d,\mathrm{bmk}}^{\prime }} ,\ldots ,\varvec{l}_{tS}^{{d,\mathrm{bmk}}^{\prime }})^{\prime }\) is the vector of the benchmark errors for the state vectors in the division. Now, using a similar decomposition to Eq. (D.2) in P–T, we have,
where \(W_t^d =\tilde{T}^dG_t^d \), \(\tilde{{\mathbf {e}}}_t^d =(e_{d1,t} ,\ldots ,e_{dS,t} ,r_{d,t}^\mathrm{bmk})^{\prime }=({{\mathbf {e}}}{'}_t^d ,r_{d,t}^\mathrm{bmk} )^{\prime }\) and \(r_{d,t}^\mathrm{bmk} ={{\mathbf {z}}}^{\prime }_{dt} J_d \varvec{l}_t^\mathrm{bmk} \) is the division benchmark error at time \(t\), with \(\varvec{l}_t^\mathrm{bmk} \) defined by (37). The matrices \(G_t^d \) and \(K_t^d \) are defined similarly to \(G_t \) and \(K_t \) in Eq. (D.2) of P–T, but refer to the States in a given division instead of to the divisions. By repeated substitutions in (43), and ignoring the term \(\tilde{T}^d(\tilde{\varvec{\alpha } }_0^{d,\mathrm{bmk}} -\varvec{\alpha }_0^d )\) for time \(t=0\), which is independent of the rest of the series and drops out of each of the computations that follow, we obtain
where \(D_{t,k}^d =W_t^d \times W_{t-1}^d \times \ldots \times W_k^d ,\,\,k=2,\ldots ,t,\,\,D_{t,t+1}^d =I_{Sq} ,\,\,B_{t,k}^d =D_{t,k+1}^d \tilde{T}^dK_k^d ,\,\,k=1,\ldots ,t-1\), \(I_{Sq} \) is the identity matrix of order \(Sq\) (\(q=\dim (\eta _{ds} ))\) and \(B_{t,t}^d =\tilde{T}^dK_t^d \).
At time \(t+1\) we need to compute \(C_{d,t+1}^\mathrm{bmk} =E[(\tilde{T}^d \tilde{\varvec{\alpha }}_t^{d,\mathrm{bmk}} -\varvec{\alpha }_{t+1}^d ){\tilde{{\mathbf {e}}}}{'}_{t+1}^d ]=E({\mathbf {m}}_t^d \tilde{{{\mathbf {e}}}{'}}_{t+1}^d )\). By (44)
Next, we evaluate each of the expectations in (45). Define
By (29),
By (37) and (39) and noting that \(E({\mathbf {e}}_j^d {\varvec{\eta }}^{\prime }_k )=0\) for all \((j,k)\),
Remark 18
\({{\mathbf {h}}}{'}_{k,t+1}^d \ne {{\mathbf {h}}}{'}_{t+1,k}^d \).
Recalling that the error vectors \(\tilde{{\mathbf {e}}}_j =(e_{1j} ,\ldots ,e_{Dj} ,\sum \nolimits _{d=1}^D {b_{dj} e_{dj} })^{\prime }\) are only functions of the division sampling errors and thus independent of the state error vectors \(\{\varvec{\eta }_l\}\), and that under the model \(E(\varvec{\eta }_j {\varvec{\eta }}^{\prime }_j )=Q\); \(E(\varvec{\eta }_j {\varvec{\eta }}^{\prime }_{\,l} )=0\), \(j\ne l\), and defining \(\tilde{\Sigma }_{i,j} =E(\tilde{{\mathbf {e}}}_i {\tilde{{\mathbf {e}}}}{'}_j )\),
Remark 19
Equation (50) refers to the first step of the benchmarking process.
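The structure of \(\tilde{\Sigma }_{i,i}\) follows directly from the definition \(\tilde{{\mathbf {e}}}_t =(e_{1t},\ldots ,e_{Dt},\sum _d b_{dt}e_{dt})^{\prime }\) with independent division sampling errors: the last row and column contain \(b_{dt}\sigma _{d}^2\) and the corner element is the variance of the weighted sum. A numerical sketch (the variances and weights below are hypothetical), with a Monte Carlo check:

```python
import numpy as np

D = 4
sigma2 = np.array([1.0, 2.0, 0.5, 1.5])   # hypothetical Var(e_{dt})
b = np.array([0.3, 0.2, 0.4, 0.1])        # hypothetical weights b_{dt}

Sigma = np.diag(sigma2)                   # Cov of (e_1, ..., e_D)

# e_tilde = (e_1, ..., e_D, sum_d b_d e_d); its (D+1) x (D+1) covariance:
Sigma_tilde = np.zeros((D + 1, D + 1))
Sigma_tilde[:D, :D] = Sigma
Sigma_tilde[:D, D] = Sigma @ b            # Cov(e_d, sum_k b_k e_k) = b_d * sigma_d^2
Sigma_tilde[D, :D] = Sigma @ b
Sigma_tilde[D, D] = b @ Sigma @ b         # Var(sum_d b_d e_d)

# Monte Carlo check of the closed form
rng = np.random.default_rng(2)
e = rng.standard_normal((200_000, D)) * np.sqrt(sigma2)
et = np.column_stack([e, e @ b])
emp = et.T @ et / len(et)
assert np.allclose(emp, Sigma_tilde, atol=0.05)
```

For \(i\ne j\) the sampling errors are autocorrelated over time in the actual model, so the off-diagonal blocks \(\tilde{\Sigma }_{i,j}\) are built the same way from the lagged covariances \(E(e_{di}e_{dj})\) rather than from \(\hbox {diag}[\sigma _d^2]\).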
Finally, by (37) and noting that \(E(\varvec{\eta }_k^d {\varvec{\eta } }^{\prime }_j )=0\) for \(j\ne k\),
where \(0_{Sq\times S} \) is the null matrix of dimension \(Sq\times S\). We assume throughout that \(\varvec{\alpha }_{dk} = \sum \nolimits _{s=1}^S {b_{ds,k} \varvec{\alpha }_{ds,k}}\), implying \(\varvec{\eta }_k^d =J_d \varvec{\eta }_k = \sum \nolimits _{s=1}^S {b_{ds,k} \varvec{\eta }_{ds,k} } \) and hence,
where the matrix of dimension \(Sq\times Dq\) has \([b_{d1,k} Q_{d1,k},\ldots ,b_{dS,k} Q_{dS,k}]^{\prime }\) in columns \((d-1)q+1,\ldots ,dq\) and zeroes elsewhere, with \(Q_{ds,k} =E(\varvec{\eta }_{ds,k} {\varvec{\eta }}^{\prime }_{ds,k} )\) (Eq. 29).
Substituting (47)–(50) in (46), and then (46) and (52) in (45) completes the computation of the covariance matrix \(C_{d,t+1}^\mathrm{bmk} \).
Appendix C: Computation of \(P_t^{d,\mathrm{bmk}} =E[(\hat{\varvec{\alpha }}_t^{d,\mathrm{bmk}} -\varvec{\alpha }_t^d )(\hat{\varvec{\alpha }}_t^{d,\mathrm{bmk}} -\varvec{\alpha }_t^d )^{\prime }]\)
The computation of the true V–C matrix of the benchmarked state prediction errors follows the same steps as the computation of the V–C matrix \(P_t^\mathrm{bmk} \) in the first-stage benchmarking (Appendix in P–T). First, rewrite the benchmarked predictor (33) as,
Next, substitute \(\tilde{{\mathbf {y}}}_t^d =\tilde{Z}_t^d \varvec{\alpha }_t^d +\tilde{{\mathbf {e}}}_t^d \) and decompose \(\varvec{\alpha }_t^d =G_t^d \varvec{\alpha }_t^d +K_t^d Z_t^d \varvec{\alpha }_t^d \), implying
The expression for \(P_t^{d,\mathrm{bmk}} \) in (34) follows straightforwardly.
Appendix D: Derivation of Eq. (24)
In what follows we drop for convenience the area index \(i\) from the notation. Let \(\beta _t =T\beta _{t-1} +\eta _t \), where \(\eta _t\) is white noise. Repeated substitutions imply
For the model defined by (23b)–(23c), \(\beta _t =(\beta _t^{(1)\prime } ,\beta _t^{(2)\prime })^{\prime }\), where \(\beta _t^{(1)} =(L_t ,R_t )^{\prime }\) and \(\beta _t^{(2)} =(S_{1,t} ,S_{1,t}^*,\ldots ,S_{5,t} ,S_{5,t}^*,S_{6,t} )^{\prime }\). Accordingly, let \(\eta _t =(\eta _t^{(1)\prime } ,\eta _t^{(2)\prime } )^{\prime }\), where \(\eta _t^{(1)} =(\eta _{Lt} ,\eta _{Rt})^{\prime }\) and \(\eta _t^{(2)} =(\eta _{1t} ,\eta _{1t}^*,\ldots ,\eta _{5t} ,\eta _{5t}^*,\eta _{6t} )^{\prime }\). Define \(h=(1,0,1,0,\ldots ,1,0,1)^{\prime }\) such that by (23c), \(S_t =h^{\prime }\beta _t^{(2)} \). Then, by (23b), the population value at time \(t\) is
Under the model (23b)–(23c), \(T=\left[ \begin{array}{cc} T_{(1)} &amp; 0 \\ 0 &amp; T_{(2)} \end{array} \right] \), where \(T_{(1)} =\left[ \begin{array}{cc} 1 &amp; 1 \\ 0 &amp; 1 \end{array} \right] \) and
At time \(t\), \(T^t=\left[ \begin{array}{cc} T_{(1)}^t &amp; 0 \\ 0 &amp; T_{(2)}^t \end{array} \right] \), where \(T_{(1)}^t =\left[ \begin{array}{cc} 1 &amp; 1 \\ 0 &amp; 1 \end{array} \right] ^t=\left[ \begin{array}{cc} 1 &amp; t \\ 0 &amp; 1 \end{array} \right] \) and \(T_{(2)}^t \) is obtained from \(T_{(2)} \) by replacing every angle \(\frac{j\pi }{6}\) in \(T_{(2)} \) by \(\frac{j\pi t}{6}\), \(j=1,\ldots ,6\).
It follows that \(T_{(1)}^t \beta _0^{(1)} =(L_0 +tR_0 ,R_0 )^{\prime }\), and by familiar trigonometric identities \(h^{\prime }\beta _t^{(2)} =h^{\prime }T_{(2)}^t \beta _0^{(2)} =(\cos \frac{\pi }{6}t,\sin \frac{\pi }{6}t,\ldots ,\cos \pi t)\beta _0^{(2)} =\sum \nolimits _{j=1}^6 {S_{j0} \cos \frac{\pi j}{6}t} +\sum \nolimits _{j=1}^5 {S_{j0}^*\sin \frac{\pi j}{6}t}.\)
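Both closed forms above, the linear-trend identity \(T_{(1)}^t =\left[ \begin{array}{cc} 1 &amp; t \\ 0 &amp; 1 \end{array} \right] \) and the angle-multiplication property of the seasonal rotation blocks, are easy to verify numerically. A short sketch (the sign convention of the \(2\times 2\) rotation below is an assumption; any orthogonal rotation with angle \(j\pi /6\) behaves the same way):

```python
import numpy as np
from numpy.linalg import matrix_power

t = 7

# level/slope block: [[1, 1], [0, 1]]^t = [[1, t], [0, 1]]
T1 = np.array([[1, 1], [0, 1]])
assert np.array_equal(matrix_power(T1, t), np.array([[1, t], [0, 1]]))

# a seasonal rotation block with angle j*pi/6; its t-th power is the
# rotation with angle j*pi*t/6, i.e., the angle is replaced by t times itself
def rot(theta):
    return np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

j = 2
assert np.allclose(matrix_power(rot(j * np.pi / 6), t), rot(j * np.pi * t / 6))
```

In particular, raising the \(j=1\) block to the power 12 returns the identity, reflecting the 12-month seasonal period of the monthly model.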
Similar computations imply,
Equation (24) follows from (57).
Appendix E: Census Divisions and States in the USA
| Census division | States |
|---|---|
| New England | Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont |
| Middle Atlantic | New Jersey, New York, Pennsylvania |
| East North Central | Illinois, Indiana, Michigan, Ohio, Wisconsin |
| West North Central | Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota |
| South Atlantic | Delaware, District of Columbia, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, West Virginia |
| East South Central | Alabama, Kentucky, Mississippi, Tennessee |
| West South Central | Arkansas, Louisiana, Oklahoma, Texas |
| Mountain | Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, Wyoming |
| Pacific | Alaska, California, Hawaii, Oregon, Washington |
Cite this article
Pfeffermann, D., Sikov, A. & Tiller, R. Single- and two-stage cross-sectional and time series benchmarking procedures for small area estimation. TEST 23, 631–666 (2014). https://doi.org/10.1007/s11749-014-0398-y
Keywords
- Autocorrelated sampling errors
- Generalized least squares
- Internal benchmarking
- Optimality
- Recursive filtering
- State-space models
- Trend and seasonal effects