Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data

Published in Annals of the Institute of Statistical Mathematics.

Abstract

Means and the covariance/dispersion matrix are the building blocks of many statistical analyses. By naturally extending the score functions based on a multivariate \(t\)-distribution to estimating equations, this article defines a class of M-estimators of means and dispersion matrix for samples with missing data. An expectation-robust (ER) algorithm solving the estimating equations is obtained. The relationship between the ER algorithm and the corresponding estimating equations allows us to obtain consistent standard errors when the robust means and dispersion matrix are further analyzed. Estimating equations corresponding to existing ER algorithms for computing M- and S-estimators are also identified. Monte Carlo results show that the robust methods outperform normal-distribution-based maximum likelihood when the population distribution has heavy tails or when the data are contaminated. Applications of the results to robust analysis of linear regression and growth curve models are discussed.
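For complete data, the \(t\)-based estimating equations reduce to a familiar iteratively reweighted scheme: each case receives the weight \(w_i=(\nu+p)/(\nu+d_i^2)\), with \(d_i^2\) the Mahalanobis distance, and the mean and dispersion updates are weighted averages. A minimal sketch of this complete-data special case (the helper name `er_t_estimates` is hypothetical; the paper's full algorithm additionally handles missing values through conditional expectations):

```python
import numpy as np

def er_t_estimates(X, df=5.0, tol=1e-8, max_iter=500):
    """Complete-data special case of the ER iteration: M-estimates of the
    mean vector and dispersion matrix using multivariate-t weights
    w_i = (df + p) / (df + d_i^2), with d_i^2 the Mahalanobis distance.
    Illustrative sketch only, not the paper's missing-data algorithm."""
    n, p = X.shape
    mu = X.mean(axis=0)                      # start from the sample mean
    Sigma = np.cov(X, rowvar=False)          # and the sample covariance
    for _ in range(max_iter):
        diff = X - mu
        # Mahalanobis distances d_i^2 = (x_i - mu)' Sigma^{-1} (x_i - mu)
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        w = (df + p) / (df + d2)             # t-based case weights
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        diff_new = X - mu_new
        Sigma_new = (w[:, None] * diff_new).T @ diff_new / n
        converged = (np.linalg.norm(mu_new - mu)
                     + np.linalg.norm(Sigma_new - Sigma) < tol)
        mu, Sigma = mu_new, Sigma_new
        if converged:
            break
    return mu, Sigma, w
```

Outlying cases receive large \(d_i^2\) and hence small weights, which is what makes the resulting estimates robust to heavy tails and contamination.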


Notes

  1. The EM algorithm presented here is slightly different from that in Little (1988), where a conditional normal distribution is used.

  2. An S-estimator is not defined by estimating equations but by minimizing \(|\mathbf{\Sigma }|\) under a proper constraint.

  3. The true distribution of the observed \(\mathbf{x}_i\) will differ from the corresponding marginal distribution of \(\mathbf{x}\) when the missing values are missing at random or missing not at random.

References

  • Cheng, T. C., Victoria-Feser, M. P. (2002). High-breakdown estimation of multivariate mean and covariance with missing observations. British Journal of Mathematical and Statistical Psychology, 55, 317–335.

  • Devlin, S. J., Gnanadesikan, R., Kettenring, J. R. (1981). Robust estimation of dispersion matrices and principal components. Journal of the American Statistical Association, 76, 354–362.

  • Efron, B., Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

  • Godambe, V. P. (1960). An optimum property of regular maximum likelihood estimation. Annals of Mathematical Statistics, 31, 1208–1211.

  • Godambe, V. P. (Ed.). (1991). Estimating functions. New York: Oxford University Press.

  • Green, P. J. (1984). Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives (with discussion). Journal of the Royal Statistical Society B, 46, 149–192.

  • Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., Stahel, W. A. (1986). Robust statistics: The approach based on influence functions. New York: Wiley.

  • Heritier, S., Cantoni, E., Copt, S., Victoria-Feser, M. P. (2009). Robust methods in biostatistics. Southern Gate: Wiley.

  • Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (Vol. I, pp. 221–233). Oakland: University of California Press.

  • Huber, P. J. (1981). Robust statistics. New York: John Wiley.

  • Johnson, R. A., Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). New Jersey: Prentice-Hall.

  • Kano, Y., Berkane, M., Bentler, P. M. (1993). Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. Journal of the American Statistical Association, 88, 135–143.

  • Kelley, C. T. (2003). Solving nonlinear equations with Newton’s method. Philadelphia: SIAM.

  • Kent, J. T., Tyler, D. E., Vardi, Y. (1994). A curious likelihood identity for the multivariate t-distribution. Communications in Statistics Simulation and Computation, 23, 441–453.

  • Liang, K. Y., Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.

  • Little, R. J. A. (1988). Robust estimation of the mean and covariance matrix from data with missing values. Applied Statistics, 37, 23–38.

  • Little, R. J. A., Schluchter, M. D. (1985). Maximum likelihood estimation for mixed continuous and categorical data with missing values. Biometrika, 72, 497–512.

  • Little, R. J. A., Smith, P. J. (1987). Editing and imputing for quantitative survey data. Journal of the American Statistical Association, 82, 58–68.

  • Liu, C. (1997). ML estimation of the multivariate \(t\) distribution and the EM algorithm. Journal of Multivariate Analysis, 63, 296–312.

  • Lopuhaä, H. P. (1989). On the relation between S-estimators and M-estimators of multivariate location and covariances. Annals of Statistics, 17, 1662–1683.

  • Maronna, R. A. (1976). Robust M-estimators of multivariate location and scatter. Annals of Statistics, 4, 51–67.

  • Maronna, R. A., Martin, R. D., Yohai, V. J. (2006). Robust statistics: theory and methods. New York: Wiley.

  • Maronna, R., Zamar, R. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44, 307–317.

  • Mehrotra, D. V. (1995). Robust elementwise estimation of a dispersion matrix. Biometrics, 51, 1344–1351.

  • Meng, X. L., van Dyk, D. A. (1997). The EM algorithm: an old folk song sung to a fast new tune (with discussion). Journal of the Royal Statistical Society B, 59, 511–567.

  • Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166.

  • Poon, W. Y., Poon, Y. S. (2002). Influential observations in the estimation of mean vector and covariance matrix. British Journal of Mathematical and Statistical Psychology, 55, 177–192.

  • Prentice, R. L., Zhao, L. P. (1991). Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics, 47, 825–839.

  • Richardson, A. M., Welsh, A. H. (1995). Robust restricted maximum likelihood in mixed linear models. Biometrics, 51, 1429–1439.

  • Rocke, D. M. (1996). Robustness properties of S-estimators of multivariate location and shape in high dimension. Annals of Statistics, 24, 1327–1345.

  • Rubin, D. B. (1976). Inference and missing data (with discussions). Biometrika, 63, 581–592.

  • Ruppert, D. (1992). Computing S estimators for regression and multivariate location/dispersion. Journal of Computational and Graphical Statistics, 1, 253–270.

  • Savalei, V., Falk, C. (2014). Robust two-stage approach outperforms robust FIML with incomplete nonnormal data. Structural Equation Modeling, 21, 280–302.

  • Schott, J. (2005). Matrix analysis for statistics (2nd ed.). New York: Wiley.

  • Sen, P. K. (1968). Estimates of the regression coefficient based on Kendall's tau. Journal of the American Statistical Association, 63, 1379–1389.

  • Theil, H. (1950). Rank invariant method for linear and polynomial regression analysis. Indagationes Mathematicae, 12, 85–91.

  • Tyler, D. E. (1991). Some issues in the robust estimation of multivariate location and scatter. In W. Stahel & S. Weisberg (Eds.), Directions in robust statistics and diagnostics Part II (pp. 327–336). New York: Springer-Verlag.

  • Wilcox, R. R. (1998). A note on the Theil-Sen regression estimator when the regressor is random and the error term is heteroscedastic. Biometrical Journal, 40, 261–268.

  • Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing (3rd ed.). Waltham: Academic Press.

  • Yuan, K.-H., Jennrich, R. I. (1998). Asymptotics of estimating equations under natural conditions. Journal of Multivariate Analysis, 65, 245–260.

  • Yuan, K.-H., Zhang, Z. (2012). Robust structural equation modeling with missing data and auxiliary variables. Psychometrika, 77, 803–826.

  • Yuan, K.-H., Bentler, P. M., Chan, W. (2004). Structural equation modeling with heavy tailed distributions. Psychometrika, 69, 421–436.

  • Yuan, K.-H., Wallentin, F., Bentler, P. M. (2012). ML versus MI for missing data with violation of distribution conditions. Sociological Methods & Research, 41, 598–629.

Author information


Correspondence to Ke-Hai Yuan.

Appendix A

This appendix shows that the converged values of the ER algorithm in (2), (3), (12) and (13) satisfy (8) and (9). To simplify notation, we use \({\varvec{\mu }}\) and \(\mathbf{\Sigma }\) to denote the converged values and rewrite (12) and (13) as

$$\begin{aligned} \sum _{i=1}^nw_{i1}(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})=\mathbf{0}\end{aligned}$$
(28)

and

$$\begin{aligned} \sum _{i=1}^n \left[ w_{i2}(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }}) (\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})'+w_{i3}(\mathbf{C}_i-\mathbf{\Sigma })\right] =\mathbf{0}, \end{aligned}$$
(29)

where

$$\begin{aligned} \mathbf{C}_i=\left( \begin{array}{ll} \mathbf{0}&{}\quad \mathbf{0}\\ \mathbf{0}&{}\quad \mathbf{C}_{imm} \end{array} \right) \end{aligned}$$

with \(\mathbf{C}_{imm}=\mathbf{\Sigma }_{imm}-\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_i^{-1}\mathbf{\Sigma }_{iom}\) being the converged \(\mathbf{C}_{imm}^{(j)}\) in (3). Notice that

$$\begin{aligned} \hat{\mathbf{x}}_{ic}-{\varvec{\mu }}=\left( \begin{array}{c} \mathbf{x}_i-{\varvec{\mu }}_i\\ \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1}(\mathbf{x}_i-{\varvec{\mu }}_{i}) \end{array} \right) \end{aligned}$$

and

$$\begin{aligned} \mathbf{\Sigma }^{-1}=\left( \begin{array}{ll} \mathbf{I}_{p_i}&{}\quad -\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\\ \mathbf{0}&{}\quad \mathbf{I}_{q_i} \end{array} \right) \left( \begin{array}{ll} \mathbf{\Sigma }_{i}^{-1}&{}\quad \mathbf{0}\\ \mathbf{0}&{}\quad \mathbf{\Sigma }_{i(m|o)}^{-1} \end{array} \right) \left( \begin{array}{ll} \mathbf{I}_{p_i}&{}\quad \mathbf{0}\\ -\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1}&{}\quad \mathbf{I}_{q_i} \end{array} \right) , \end{aligned}$$

where \(q_i=p-p_i\) and \(\mathbf{\Sigma }_{i(m|o)}=\mathbf{C}_{imm}\). Direct matrix multiplication yields

$$\begin{aligned} \mathbf{\Sigma }^{-1}(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})= \left( \begin{array}{l} \mathbf{\Sigma }_{i}^{-1}(\mathbf{x}_i-{\varvec{\mu }}_{i})\\ \mathbf{0}\end{array} \right) . \end{aligned}$$
(30)

The equivalence of (8) and (28) follows from (30) and \(\partial {\varvec{\mu }}_i/\partial {\varvec{\mu }}'=(\mathbf{I}_{p_i}, \mathbf{0})\).
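Identity (30) is easy to verify numerically: completing \(\mathbf{x}_i\) by its conditional mean and multiplying by \(\mathbf{\Sigma }^{-1}\) annihilates the missing block. A small NumPy check under assumed dimensions (\(p=5\) variables, \(p_i=3\) observed):

```python
import numpy as np

# Numerical check of identity (30): with x_hat the conditional-mean
# completion of x_i, Sigma^{-1}(x_hat - mu) has zeros in the missing block
# and Sigma_i^{-1}(x_i - mu_i) in the observed block.
rng = np.random.default_rng(1)
p, p_i = 5, 3                       # total and observed dimensions (assumed)
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)     # a positive definite dispersion matrix
mu = rng.normal(size=p)
x_obs = rng.normal(size=p_i)        # observed part x_i

S_oo = Sigma[:p_i, :p_i]            # Sigma_i in the paper's notation
S_mo = Sigma[p_i:, :p_i]            # Sigma_imo
reg = np.linalg.solve(S_oo, x_obs - mu[:p_i])   # Sigma_i^{-1}(x_i - mu_i)
x_hat = np.concatenate([x_obs, mu[p_i:] + S_mo @ reg])  # completed vector

lhs = np.linalg.solve(Sigma, x_hat - mu)
assert np.allclose(lhs[:p_i], reg)   # observed block: Sigma_i^{-1}(x_i - mu_i)
assert np.allclose(lhs[p_i:], 0.0)   # missing block vanishes
```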

To show the equivalence of (9) and (29), let \(\mathbf{H}_i=(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})'\). When \(p_i<p\),

$$\begin{aligned} \mathbf{H}_i=\left( \begin{array}{ll} \mathbf{H}_{ioo}&{}\quad \mathbf{H}_{ioo}\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\\ \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1}\mathbf{H}_{ioo} &{}\quad \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1}\mathbf{H}_{ioo}\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom} \end{array} \right) , \end{aligned}$$

where \(\mathbf{H}_{ioo}=(\mathbf{x}_i-{\varvec{\mu }}_i)(\mathbf{x}_i-{\varvec{\mu }}_i)'\). Matrix multiplications yield

$$\begin{aligned} \mathbf{\Sigma }^{-1}\mathbf{H}_i\mathbf{\Sigma }^{-1} = \left( \begin{array}{l@{\quad }l} \mathbf{\Sigma }_{i}^{-1}\mathbf{H}_{ioo}\mathbf{\Sigma }_{i}^{-1}&{}\mathbf{0}\\ \mathbf{0}&{}\mathbf{0}\end{array} \right) \end{aligned}$$
(31)

and

$$\begin{aligned} \mathbf{\Sigma }^{-1}\mathbf{C}_i\mathbf{\Sigma }^{-1}= \left( \begin{array}{l@{\quad }l} \mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\mathbf{\Sigma }_{i(m|o)}^{-1} \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1} &{}-\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\mathbf{\Sigma }_{i(m|o)}^{-1}\\ -\mathbf{\Sigma }_{i(m|o)}^{-1}\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1} &{}\mathbf{\Sigma }_{i(m|o)}^{-1} \end{array} \right) . \end{aligned}$$

Notice that

$$\begin{aligned} \mathbf{\Sigma }^{-1}= \left( \begin{array}{ll} \mathbf{\Sigma }_{i}^{-1}+\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\mathbf{\Sigma }_{i(m|o)}^{-1} \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1} &{}-\mathbf{\Sigma }_{i}^{-1}\mathbf{\Sigma }_{iom}\mathbf{\Sigma }_{i(m|o)}^{-1}\\ -\mathbf{\Sigma }_{i(m|o)}^{-1}\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_{i}^{-1} &{}\mathbf{\Sigma }_{i(m|o)}^{-1} \end{array} \right) . \end{aligned}$$

It follows that

$$\begin{aligned} \mathbf{\Sigma }^{-1}=\mathbf{\Sigma }^{-1}\mathbf{C}_i\mathbf{\Sigma }^{-1}+ \left( \begin{array}{ll} \mathbf{\Sigma }_{i}^{-1}&{}\mathbf{0}\\ \mathbf{0}&{}\mathbf{0}\end{array} \right) . \end{aligned}$$
(32)

The equivalence of (9) and (29) follows from (31), (32), and by noticing that (9) can be rewritten as

$$\begin{aligned} \sum _{i=1}^n\mathrm{tr}\left\{ \left[ w_{i2}\mathbf{\Sigma }_i^{-1}(\mathbf{x}_i-{\varvec{\mu }}_i) (\mathbf{x}_i-{\varvec{\mu }}_i)'\mathbf{\Sigma }_i^{-1}-w_{i3}\mathbf{\Sigma }_i^{-1}\right] (d\mathbf{\Sigma }_i)\right\} =0, \end{aligned}$$

where \(d\mathbf{\Sigma }_i\) is the differential of \(\mathbf{\Sigma }_i\).
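Identities (31) and (32) can likewise be checked numerically with a random positive definite \(\mathbf{\Sigma }\) (dimensions \(p=5\), \(p_i=3\) are assumed for illustration):

```python
import numpy as np

# Numerical check of (31) and (32): Sigma^{-1} H_i Sigma^{-1} is block
# diagonal with Sigma_i^{-1} H_ioo Sigma_i^{-1} in the observed block, and
# Sigma^{-1} = Sigma^{-1} C_i Sigma^{-1} + diag(Sigma_i^{-1}, 0).
rng = np.random.default_rng(2)
p, p_i = 5, 3
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)
mu = rng.normal(size=p)
x_obs = rng.normal(size=p_i)

S_oo, S_om = Sigma[:p_i, :p_i], Sigma[:p_i, p_i:]
S_mo, S_mm = Sigma[p_i:, :p_i], Sigma[p_i:, p_i:]
Sinv = np.linalg.inv(Sigma)
S_oo_inv = np.linalg.inv(S_oo)

r = x_obs - mu[:p_i]                               # x_i - mu_i
x_hat = np.concatenate([x_obs, mu[p_i:] + S_mo @ S_oo_inv @ r])
H = np.outer(x_hat - mu, x_hat - mu)               # H_i
H_oo = np.outer(r, r)                              # H_ioo

# (31): only the observed block survives
lhs31 = Sinv @ H @ Sinv
assert np.allclose(lhs31[:p_i, :p_i], S_oo_inv @ H_oo @ S_oo_inv)
assert np.allclose(lhs31[:p_i, p_i:], 0.0)
assert np.allclose(lhs31[p_i:, :p_i], 0.0)
assert np.allclose(lhs31[p_i:, p_i:], 0.0)

# (32): C_i = diag(0, Sigma_imm - Sigma_imo Sigma_i^{-1} Sigma_iom)
C = np.zeros((p, p))
C[p_i:, p_i:] = S_mm - S_mo @ S_oo_inv @ S_om
block = np.zeros((p, p))
block[:p_i, :p_i] = S_oo_inv
assert np.allclose(Sinv, Sinv @ C @ Sinv + block)
```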

About this article

Cite this article

Yuan, KH., Chan, W. & Tian, Y. Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data. Ann Inst Stat Math 68, 329–351 (2016). https://doi.org/10.1007/s10463-014-0498-1
