Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances

Yuan, Ke-Hai; Jamshidian, Mortaza; Kano, Yutaka

doi:10.1007/s11336-018-9609-x

Published: 12 March 2018

Psychometrika, Volume 83, pages 425–442 (2018)

Abstract

Unless data are missing completely at random (MCAR), proper methodology is crucial for the analysis of incomplete data. Consequently, methods for effectively testing the MCAR mechanism become important, and procedures were developed via testing the homogeneity of means and variances–covariances across the observed patterns (e.g., Kim & Bentler in Psychometrika 67:609–624, 2002; Little in J Am Stat Assoc 83:1198–1202, 1988). The current article shows that the population counterparts of the sample means and covariances of a given pattern of the observed data depend on the underlying structure that generates the data, and the normal-distribution-based maximum likelihood estimates for different patterns of the observed sample can converge to the same values even when data are missing at random or missing not at random, although the values may not equal those of the underlying population distribution. The results imply that statistics developed for testing the homogeneity of means and covariances cannot be safely used for testing the MCAR mechanism even when the population distribution is multivariate normal.
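
To make concrete what "homogeneity across the observed patterns" compares, the following minimal sketch simulates two variables and summarizes \(x_1\) separately for the cases where \(x_2\) is observed and where it is missing. It is an illustration only and not the construction used in the article: the function names, the 30% MCAR rate, the cutoffs 0.5 and 1.0, and the three mechanisms are all assumptions chosen for the example. Under the symmetric mechanism, missingness depends on the observed \(x_1\) (so the data are not MCAR), yet the pattern means of \(x_1\) still agree; in this simple case only the variances differ.

```python
import math
import random
import statistics


def simulate(n, rho, mechanism, seed=0):
    """Draw (x1, x2, x2_is_missing) triples.

    x1 = z1 and x2 = rho*z1 + sqrt(1 - rho^2)*z2, with z1 and z2
    independent standard normal draws (an assumption of this sketch).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        if mechanism == "mcar":
            # Missingness ignores the data entirely.
            missing = rng.random() < 0.3
        elif mechanism == "mar_shift":
            # Missingness depends on the observed x1, one-sided.
            missing = x1 > 0.5
        elif mechanism == "mar_symmetric":
            # Missingness depends on the observed x1, symmetric about 0.
            missing = abs(x1) > 1.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        rows.append((x1, x2, missing))
    return rows


def pattern_summary(rows):
    """Mean and sample variance of x1 within each missing-data pattern."""
    summary = {}
    for label, flag in (("x2 observed", False), ("x2 missing", True)):
        x1_values = [x1 for x1, _, missing in rows if missing == flag]
        summary[label] = (statistics.mean(x1_values),
                          statistics.variance(x1_values))
    return summary


for mechanism in ("mcar", "mar_shift", "mar_symmetric"):
    print(mechanism, pattern_summary(simulate(20000, 0.5, mechanism)))
```

A comparison of pattern means alone would flag the one-sided mechanism but not the symmetric one, which is the kind of blind spot that motivates asking what homogeneity tests can and cannot establish about the missing data mechanism.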

Notes

In the concluding section on page 23, Thoemmes and Enders (2007) stated “One possible explanation for these results is that the sample size conditions we investigated (\(N = 100\) and 500) were not large enough for the test’s asymptotic properties to be evidenced.”
The formulations in Eq. (10) are for simplicity in presentation. They can be replaced by \(x_1=\mu_1+\sigma_1z_1\) and \(x_2=\sigma_2[\rho z_1+(1-\rho^2)^{1/2}z_2]+\mu_2\), and the results in the stated theorems still hold, because maximum likelihood estimates as well as sample means and covariances are equivariant.
Thoemmes and Enders (2007) excluded the \(j\)th variable \(x_{ij}\) when predicting \(r_{ij}\); that is, only \(x_{i1}\), \(x_{i2}\), \(\ldots\), \(x_{i(j-1)}\), \(x_{i(j+1)}\), \(\ldots\), \(x_{ip}\) served as covariates.
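
As a quick check on the general formulation \(x_1=\mu_1+\sigma_1z_1\), \(x_2=\sigma_2[\rho z_1+(1-\rho^2)^{1/2}z_2]+\mu_2\) given in the note on Eq. (10), the moments can be worked out directly. The derivation below assumes that \(z_1\) and \(z_2\) are independent with mean 0 and variance 1; that assumption is not stated in the note itself.

\[
\begin{aligned}
E(x_1) &= \mu_1, \qquad \operatorname{Var}(x_1) = \sigma_1^2,\\
E(x_2) &= \mu_2, \qquad \operatorname{Var}(x_2) = \sigma_2^2\left[\rho^2 + (1-\rho^2)\right] = \sigma_2^2,\\
\operatorname{Cov}(x_1, x_2) &= \sigma_1\sigma_2\,\rho\,\operatorname{Var}(z_1) = \rho\,\sigma_1\sigma_2,\\
\operatorname{Corr}(x_1, x_2) &= \frac{\rho\,\sigma_1\sigma_2}{\sigma_1\sigma_2} = \rho .
\end{aligned}
\]

Under that assumption, \(\mu_1\), \(\mu_2\), \(\sigma_1\), \(\sigma_2\), and \(\rho\) are the means, standard deviations, and correlation of \((x_1, x_2)\).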

References

Anderson, T. W. (1957). Maximum likelihood estimates for the multivariate normal distribution when some observations are missing. Journal of the American Statistical Association, 52, 200–203.
Bentler, P. M. (2006). EQS 6 structural equations program manual. Encino, CA: Multivariate Software.
Blanca, M. J., Arnau, J., López-Montiel, D., Bono, R., & Bendayan, R. (2015). Skewness and kurtosis in real data samples. Methodology, 9, 78–84.
Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31, 144–152.
Chen, H. Y., & Little, R. (1999). A test of missing completely at random for generalised estimating equations with missing data. Biometrika, 86, 1–13.
Enders, C. K. (2010). Applied missing data analysis. New York: Guilford.
Galati, J. C., & Seaton, K. A. (2016). MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample. Statistical Methods in Medical Research, 25, 1527–1534.
Hawkins, D. M. (1981). A new test for multivariate normality and homoscedasticity. Technometrics, 23, 105–110.
Jamshidian, M., & Jalal, S. (2010). Tests of homoscedasticity, normality and missing completely at random for incomplete multivariate data. Psychometrika, 75, 649–674.
Jamshidian, M., Jalal, S., & Jansen, C. (2014). MissMech: An R Package for testing homoscedasticity, multivariate normality, and missing completely at random (MCAR). Journal of Statistical Software, 56, 1–31.
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409–426.
Kano, Y., & Takai, K. (2011). Analysis of NMAR missing data without specifying missing-data mechanisms in a linear latent variate model. Journal of Multivariate Analysis, 102, 1241–1255.
Kim, K. H., & Bentler, P. M. (2002). Tests of homogeneity of means and covariance matrices for multivariate incomplete data. Psychometrika, 67, 609–624.
Li, J., & Yu, Y. (2015). A nonparametric test of missing completely at random for incomplete multivariate data. Psychometrika, 80, 707–726.
Little, R. J. A. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198–1202.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.
Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166.
Park, T., & Davis, C. S. (1993). A test of the missing data mechanism for repeated categorical data. Biometrics, 49, 631–638.
Park, T., & Lee, S.-Y. (1997). A test of missing completely at random for longitudinal data with missing observations. Statistics in Medicine, 16, 1859–1871.
Qu, A., & Song, P. X. K. (2002). Testing ignorable missingness in estimating equation approaches for longitudinal data. Biometrika, 89, 841–850.
Rubin, D. B. (1976). Inference and missing data (with discussions). Biometrika, 63, 581–592.
Sörbom, D. (1974). A general method for studying differences in factor means and factor structures between groups. British Journal of Mathematical and Statistical Psychology, 27, 229–239.
Tang, M., & Bentler, P. M. (1998). Theory and method for constrained estimation in structural equation models with incomplete data. Computational Statistics and Data Analysis, 27, 257–270.
Thoemmes, F., & Enders, C. K. (2007). A structural equation model for testing whether data are missing completely at random. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL.
Yuan, K.-H. (2009). Normal distribution based pseudo ML for missing data: With applications to mean and covariance structure analysis. Journal of Multivariate Analysis, 100, 1900–1918.
Yuan, K.-H., Chan, W., & Tian, Y. (2016). Expectation-robust algorithm and estimating equations for means and dispersion matrix with missing data. Annals of the Institute of Statistical Mathematics, 68, 329–351.

Author information

Authors and Affiliations

Ke-Hai Yuan: Nanjing University of Posts and Telecommunications, Nanjing, China; University of Notre Dame, Notre Dame, USA
Mortaza Jamshidian: California State University, Fullerton, Fullerton, USA
Yutaka Kano: Osaka University, Suita, Japan

Corresponding author

Correspondence to Ke-Hai Yuan.

Additional information

The research was supported by the National Science Foundation under Grant No. SES-1461355.

About this article

Cite this article

Yuan, K.-H., Jamshidian, M., & Kano, Y. Missing Data Mechanisms and Homogeneity of Means and Variances–Covariances. Psychometrika 83, 425–442 (2018). https://doi.org/10.1007/s11336-018-9609-x

Received: 05 November 2016
Revised: 09 February 2018
Published: 12 March 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s11336-018-9609-x
