Abstract
In the independent setup with multivariate responses, the data become incomplete when only partial responses, that is, responses on some but not all variables, are available from some individuals. The main challenge is to obtain valid inferences, such as unbiased and consistent estimates of the mean parameters of all response variables, from the available responses. Typically, unbalanced correlation matrices are formed, and moment or likelihood analyses based on the available responses are used for such inferences. Various imputation techniques have also been used. In the longitudinal setup, where a univariate response is collected repeatedly from each individual, the repeated responses are correlated and jointly follow a multivariate distribution. In this setup, some of the responses may be unavailable for some individuals under study. These non-responses may be monotonic or intermittent. The responses may also be missing according to a mechanism such as missing completely at random (MCAR), missing at random (MAR), or missing non-ignorably. In a longitudinal regression setup, the covariates may also be missing, but typically they are known for all time periods. Obtaining unbiased and consistent regression estimates becomes a challenge, especially when the longitudinal responses are missing under the MAR or ignorable mechanism, because a proper inference tool must accommodate both the longitudinal correlations and the missing mechanism. Over the last three decades, some progress has been made in this direction, mainly by taking partial account of the missing mechanism when developing estimation techniques. Overall, however, these techniques fall short and may still produce biased and hence inconsistent estimates. The purpose of this paper is to outline these perspectives comprehensively, so that the real progress and the remaining challenges are understood and proper inference techniques can be developed.
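To make the role of the missing mechanism concrete, the following Python sketch simulates a two-occasion longitudinal response and compares the available-case mean at the second occasion under MCAR and MAR. It is an illustration only and is not taken from the paper: the function name `simulate`, the sample size, the correlation `rho`, and the logistic dropout model are all assumptions made for this example. The inverse-probability-weighted estimate also assumes the response probabilities are known, which is rarely the case in practice.

```python
import numpy as np

def simulate(n=200_000, rho=0.7, seed=0):
    """Compare estimates of E[y2] = 0 under MCAR and MAR dropout (illustrative)."""
    rng = np.random.default_rng(seed)

    # Two correlated responses per individual; both have true mean 0.
    y1 = rng.standard_normal(n)
    y2 = rho * y1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

    # MCAR: y2 is observed with a constant probability, unrelated to the data.
    r_mcar = rng.random(n) < 0.6

    # MAR: the probability of observing y2 depends only on the observed y1
    # (hypothetical logistic model chosen for this sketch).
    p_mar = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * y1)))
    r_mar = rng.random(n) < p_mar

    return {
        "true_mean": 0.0,
        # Naive mean of the available responses.
        "mcar_available": y2[r_mcar].mean(),
        "mar_available": y2[r_mar].mean(),
        # Weighting each observed response by 1 / P(observed | y1)
        # (probabilities assumed known here).
        "mar_ipw": np.sum(y2[r_mar] / p_mar[r_mar]) / n,
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the available-case mean stays close to zero under MCAR but is shifted upward under MAR, because individuals with a large first response are more likely to be observed at the second occasion and the two responses are positively correlated. Weighting by the inverse response probability removes that shift in this toy example; it does not address the further complications the abstract raises, such as unknown mechanisms or the efficient use of longitudinal correlations.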
Acknowledgment
The author fondly acknowledges the stimulating discussion by the audience of the symposium and wishes to thank them for their comments and suggestions.
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Sutradhar, B.C. (2013). Inference Progress in Missing Data Analysis from Independent to Longitudinal Setup. In: Sutradhar, B. (eds) ISS-2012 Proceedings Volume On Longitudinal Data Analysis Subject to Measurement Errors, Missing Values, and/or Outliers. Lecture Notes in Statistics, vol 211. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6871-4_5
DOI: https://doi.org/10.1007/978-1-4614-6871-4_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6870-7
Online ISBN: 978-1-4614-6871-4
eBook Packages: Mathematics and Statistics (R0)