Abstract
Means and the covariance/dispersion matrix are the building blocks of many statistical analyses. By naturally extending the score functions based on a multivariate \(t\)-distribution to estimating equations, this article defines a class of M-estimators of means and dispersion matrix for samples with missing data. An expectation-robust (ER) algorithm for solving the estimating equations is obtained. The established relationship between the ER algorithm and the corresponding estimating equations allows us to obtain consistent standard errors when the robust means and dispersion matrix are further analyzed. Estimating equations corresponding to existing ER algorithms for computing M- and S-estimators are also identified. Monte Carlo results show that the robust methods outperform normal-distribution-based maximum likelihood when the population distribution has heavy tails or when the data are contaminated. Applications of the results to robust analyses of linear regression and growth curve models are discussed.
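A minimal sketch of this kind of ER iteration, assuming multivariate-\(t\) case weights \(w_i=(\nu+p_i)/(\nu+d_i^2)\) computed from the observed part of each case and applied to both the mean and the dispersion updates; the function name, the fixed tuning constant `nu`, and the convergence rule below are illustrative choices rather than the article's specification.

```python
import numpy as np

def er_t_estimates(X, nu=4.0, max_iter=500, tol=1e-8):
    """Illustrative expectation-robust (ER) iteration for robust means and
    dispersion matrix with missing data (np.nan marks missing entries).
    Assumes every case has at least one observed component."""
    n, p = X.shape
    obs = ~np.isnan(X)
    mu = np.nanmean(X, axis=0)           # available-case starting values
    Sigma = np.eye(p)
    for _ in range(max_iter):
        xhat = np.empty((n, p))          # completed cases (E-step means)
        C = np.zeros((n, p, p))          # E-step covariance corrections
        w = np.empty(n)                  # robust case weights
        for i in range(n):
            o, m = obs[i], ~obs[i]
            S_oo = Sigma[np.ix_(o, o)]
            r_o = X[i, o] - mu[o]
            sol = np.linalg.solve(S_oo, r_o)
            d2 = float(r_o @ sol)                 # squared Mahalanobis distance, observed part
            w[i] = (nu + o.sum()) / (nu + d2)     # t-based downweighting of outlying cases
            xhat[i] = X[i]
            if m.any():                           # conditional-mean imputation of missing entries
                S_mo = Sigma[np.ix_(m, o)]
                xhat[i, m] = mu[m] + S_mo @ sol
                C[i][np.ix_(m, m)] = Sigma[np.ix_(m, m)] - S_mo @ np.linalg.solve(S_oo, S_mo.T)
        # M-step: weighted mean of completed cases, then weighted scatter plus corrections
        mu_new = (w[:, None] * xhat).sum(axis=0) / w.sum()
        resid = xhat - mu_new
        Sigma_new = (np.einsum('i,ij,ik->jk', w, resid, resid) + C.sum(axis=0)) / n
        if np.abs(mu_new - mu).max() + np.abs(Sigma_new - Sigma).max() < tol:
            mu, Sigma = mu_new, Sigma_new
            break
        mu, Sigma = mu_new, Sigma_new
    return mu, Sigma
```

On complete data this reduces to the usual iteratively reweighted estimates of location and scatter under a multivariate \(t\); with missing values it combines conditional-mean imputation with downweighting of outlying cases.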
Notes
The EM algorithm presented here is slightly different from that in Little (1988), where a conditional normal distribution is used.
An S-estimator is not defined by estimating equations but by minimizing \(|\mathbf{\Sigma }|\) under a proper constraint (a common form of the constraint is displayed after these notes).
The true distribution of the observed \(\mathbf{x}_i\) will be different from the corresponding marginal distributions of \(\mathbf{x}\) when the missing values are either missing at random or not at random.
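For concreteness, in the complete-data case the multivariate S-estimator referred to in the second note is commonly defined (e.g., Lopuhaä 1989; Rocke 1996) as the pair \(({\varvec{\mu }},\mathbf{\Sigma })\) minimizing \(|\mathbf{\Sigma }|\) subject to a constraint of the form

\[
\frac{1}{n}\sum_{i=1}^{n}\rho \left(\left[(\mathbf{x}_i-{\varvec{\mu }})'\mathbf{\Sigma }^{-1}(\mathbf{x}_i-{\varvec{\mu }})\right]^{1/2}\right)=b_0,
\]

where \(\rho \) is a bounded, nondecreasing loss function and \(b_0\) is a constant chosen so that the estimator is consistent at the normal distribution. This generic complete-data display is given for orientation only.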
References
Cheng, T. C., Victoria-Feser, M. P. (2002). High-breakdown estimation of multivariate mean and covariance with missing observations. British Journal of Mathematical and Statistical Psychology, 55, 317–335.
Devlin, S. J., Gnanadesikan, R., Kettenring, J. R. (1981). Robust estimation of dispersion matrices and principal components. Journal of the American Statistical Association, 76, 354–362.
Efron, B., Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.
Godambe, V. P. (1960). An optimum property of regular maximum likelihood estimation. Annals of Mathematical Statistics, 31, 1208–1211.
Godambe, V. P. (Ed.). (1991). Estimating functions. New York: Oxford University Press.
Green, P. J. (1984). Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives (with discussion). Journal of the Royal Statistical Society B, 46, 149–192.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., Stahel, W. A. (1986). Robust statistics: The approach based on influence functions. New York: Wiley.
Heritier, S., Cantoni, E., Copt, S., Victoria-Feser, M. P. (2009). Robust methods in biostatistics. Southern Gate: Wiley.
Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (Vol. I, pp. 221–233). Oakland: University of California Press.
Huber, P. J. (1981). Robust statistics. New York: John Wiley.
Johnson, R. A., Wichern, D. W. (2002). Applied multivariate statistical analysis (5th ed.). New Jersey: Prentice-Hall.
Kano, Y., Berkane, M., Bentler, P. M. (1993). Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. Journal of the American Statistical Association, 88, 135–143.
Kelley, C. T. (2003). Solving nonlinear equations with Newton’s method. Philadelphia: SIAM.
Kent, J. T., Tyler, D. E., Vardi, Y. (1994). A curious likelihood identity for the multivariate t-distribution. Communications in Statistics - Simulation and Computation, 23, 441–453.
Liang, K. Y., Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.
Little, R. J. A. (1988). Robust estimation of the mean and covariance matrix from data with missing values. Applied Statistics, 37, 23–38.
Little, R. J. A., Schluchter, M. D. (1985). Maximum likelihood estimation for mixed continuous and categorical data with missing values. Biometrika, 72, 497–512.
Little, R. J. A., Smith, P. J. (1987). Editing and imputing for quantitative survey data. Journal of the American Statistical Association, 82, 58–68.
Liu, C. (1997). ML estimation of the multivariate \(t\) distribution and the EM algorithm. Journal of Multivariate Analysis, 63, 296–312.
Lopuhaä, H. P. (1989). On the relation between S-estimators and M-estimators of multivariate location and covariances. Annals of Statistics, 17, 1662–1683.
Maronna, R. A. (1976). Robust M-estimators of multivariate location and scatter. Annals of Statistics, 4, 51–67.
Maronna, R. A., Martin, R. D., Yohai, V. J. (2006). Robust statistics: Theory and methods. New York: Wiley.
Maronna, R., Zamar, R. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44, 307–317.
Mehrotra, D. V. (1995). Robust elementwise estimation of a dispersion matrix. Biometrics, 51, 1344–1351.
Meng, X. L., van Dyk, D. A. (1997). The EM algorithm: an old folk song sung to a fast new tune (with discussion). Journal of the Royal Statistical Society B, 59, 511–567.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166.
Poon, W. Y., Poon, Y. S. (2002). Influential observations in the estimation of mean vector and covariance matrix. British Journal of Mathematical and Statistical Psychology, 55, 177–192.
Prentice, R. L., Zhao, L. P. (1991). Estimating equations for parameters in means and covariances of multivariate discrete and continuous responses. Biometrics, 47, 825–839.
Richardson, A. M., Welsh, A. H. (1995). Robust restricted maximum likelihood in mixed linear models. Biometrics, 51, 1429–1439.
Rocke, D. M. (1996). Robustness properties of S-estimators of multivariate location and shape in high dimension. Annals of Statistics, 24, 1327–1345.
Rubin, D. B. (1976). Inference and missing data (with discussions). Biometrika, 63, 581–592.
Ruppert, D. (1992). Computing S estimators for regression and multivariate location/dispersion. Journal of Computational and Graphical Statistics, 1, 253–270.
Savalei, V., Falk, C. (2014). Robust two-stage approach outperforms robust FIML with incomplete nonnormal data. Structural Equation Modeling, 21, 280–302.
Schott, J. (2005). Matrix analysis for statistics (2nd ed.). New York: Wiley.
Sen, P. K. (1968). Estimates of the regression coefficient based on Kendall's tau. Journal of the American Statistical Association, 63, 1379–1389.
Theil, H. (1950). Rank invariant method for linear and polynomial regression analysis. Indagationes Mathematicae, 12, 85–91.
Tyler, D. E. (1991). Some issues in the robust estimation of multivariate location and scatter. In W. Stahel & S. Weisberg (Eds.), Directions in robust statistics and diagnostics, Part II (pp. 327–336). New York: Springer-Verlag.
Wilcox, R. R. (1998). A note on the Theil-Sen regression estimator when the regressor is random and the error term is heteroscedastic. Biometrical Journal, 40, 261–268.
Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing (3rd ed.). Waltham: Academic Press.
Yuan, K.-H., Jennrich, R. I. (1998). Asymptotics of estimating equations under natural conditions. Journal of Multivariate Analysis, 65, 245–260.
Yuan, K.-H., Zhang, Z. (2012). Robust structural equation modeling with missing data and auxiliary variables. Psychometrika, 77, 803–826.
Yuan, K.-H., Bentler, P. M., Chan, W. (2004). Structural equation modeling with heavy tailed distributions. Psychometrika, 69, 421–436.
Yuan, K.-H., Wallentin, F., Bentler, P. M. (2012). ML versus MI for missing data with violation of distribution conditions. Sociological Methods & Research, 41, 598–629.
Appendix A
This appendix shows that the converged values of the ER algorithm in (2), (3), (12) and (13) satisfy (8) and (9). For notational simplicity, we use \({\varvec{\mu }}\) and \(\mathbf{\Sigma }\) to denote the converged values and rewrite (12) and (13) as
and
where
with \(\mathbf{C}_{imm}=\mathbf{\Sigma }_{imm}-\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_i^{-1}\mathbf{\Sigma }_{iom}\) being the converged \(\mathbf{C}_{imm}^{(j)}\) in (3). Notice that
and
where \(q_i=p-p_i\) and \(\mathbf{\Sigma }_{i(m|o)}=\mathbf{C}_{imm}\). Direct matrix multiplication yields
The equivalence of (8) and (28) follows from (30) and \(\partial {\varvec{\mu }}_i/\partial {\varvec{\mu }}'=(\mathbf{I}_{p_i}, \mathbf{0})\).
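For the reader, the following standard identities, stated here as a reconstruction consistent with the definitions above (writing \({\varvec{\mu }}_{im}\) for the subvector of \({\varvec{\mu }}\) corresponding to the missing components of case \(i\), with the variables permuted so that the observed components come first), are the kind used in this step. The completed case and its centered form are

\[
\hat{\mathbf{x}}_{ic}=\begin{pmatrix}\mathbf{x}_i\\ {\varvec{\mu }}_{im}+\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_i^{-1}(\mathbf{x}_i-{\varvec{\mu }}_i)\end{pmatrix},\qquad
\hat{\mathbf{x}}_{ic}-{\varvec{\mu }}=\begin{pmatrix}\mathbf{I}_{p_i}\\ \mathbf{\Sigma }_{imo}\mathbf{\Sigma }_i^{-1}\end{pmatrix}(\mathbf{x}_i-{\varvec{\mu }}_i),
\]

and, using the partitioned inverse of \(\mathbf{\Sigma }\) with Schur complement \(\mathbf{\Sigma }_{i(m|o)}\), direct multiplication gives

\[
\mathbf{\Sigma }^{-1}(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})=\begin{pmatrix}\mathbf{\Sigma }_i^{-1}(\mathbf{x}_i-{\varvec{\mu }}_i)\\ \mathbf{0}\end{pmatrix},
\]

which links the weighted average in (28) to the estimating equation (8).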
To show the equivalence of (9) and (29), let \(\mathbf{H}_i=(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})(\hat{\mathbf{x}}_{ic}-{\varvec{\mu }})'\). When \(p_i<p\),
where \(\mathbf{H}_{ioo}=(\mathbf{x}_i-{\varvec{\mu }}_i)(\mathbf{x}_i-{\varvec{\mu }}_i)'\). Matrix multiplications yield
and
Notice that
There exists
The equivalence of (9) and (29) follows from (31), (32), and by noticing that (9) can be rewritten as
where \(d\mathbf{\Sigma }_i\) is the differential of \(\mathbf{\Sigma }_i\).
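Similarly, for the second half of the argument, a block decomposition consistent with the definitions of \(\mathbf{H}_i\) and \(\mathbf{H}_{ioo}\) above (again a reconstruction, with \(\mathbf{A}_i=\mathbf{\Sigma }_{imo}\mathbf{\Sigma }_i^{-1}\) introduced here as an auxiliary symbol) is

\[
\mathbf{H}_i=\begin{pmatrix}\mathbf{H}_{ioo}&\mathbf{H}_{ioo}\mathbf{A}_i'\\ \mathbf{A}_i\mathbf{H}_{ioo}&\mathbf{A}_i\mathbf{H}_{ioo}\mathbf{A}_i'\end{pmatrix},
\qquad \mathbf{H}_{ioo}=(\mathbf{x}_i-{\varvec{\mu }}_i)(\mathbf{x}_i-{\varvec{\mu }}_i)'.
\]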