Working correlation structure selection in GEE analysis
Abstract
The method of generalized estimating equations (GEE) models the association between the repeated observations on a subject. Selecting an appropriate working correlation structure for the repeated measures on each subject is important because it improves the efficiency of the regression parameter estimates. Some existing selection criteria choose the structure for which the covariance matrix estimator and the specified working covariance matrix are closest. We define a new criterion based on this idea for selecting a working correlation structure. Using simulations, we compare our criterion with some existing criteria in identifying the true correlation structure for Poisson, binomial and normal responses under exchangeable or AR(1) intracluster correlation. We assume that the number of observations is the same for every subject. We also illustrate the performance of our criterion using two data sets. We conclude that our approach is a good selection criterion in most of the settings considered.
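The idea of choosing the structure whose working matrix lies closest to an empirical estimate can be illustrated with a minimal sketch. This is not the criterion defined in the paper, whose formula is not given here. It assumes balanced data, standardized residuals that are already available, moment-style estimates of the correlation parameter, and the Frobenius norm as the distance. The function names (`exchangeable`, `ar1`, `select_structure`) are hypothetical.

```python
import numpy as np


def exchangeable(alpha, n):
    """Exchangeable correlation: 1 on the diagonal, alpha everywhere else."""
    return (1 - alpha) * np.eye(n) + alpha * np.ones((n, n))


def ar1(alpha, n):
    """AR(1) correlation: entry (j, k) equals alpha ** |j - k|."""
    idx = np.arange(n)
    return alpha ** np.abs(idx[:, None] - idx[None, :])


def select_structure(residuals):
    """Pick the candidate structure closest to the empirical correlation.

    residuals: array of shape (subjects, n) holding standardized residuals,
    with the same number n of observations for every subject.
    The Frobenius distance is an illustrative choice, not the paper's criterion.
    """
    n = residuals.shape[1]
    emp = np.corrcoef(residuals, rowvar=False)

    # Simple moment-style estimates of the correlation parameter (assumed).
    off_diagonal = emp[~np.eye(n, dtype=bool)]
    alpha_exch = off_diagonal.mean()
    alpha_ar1 = np.diag(emp, k=1).mean()

    candidates = {
        "exchangeable": exchangeable(alpha_exch, n),
        "ar1": ar1(alpha_ar1, n),
    }
    distances = {
        name: np.linalg.norm(emp - R, "fro") for name, R in candidates.items()
    }
    best = min(distances, key=distances.get)
    return best, distances


# Simulated normal responses with a true AR(1) structure (alpha = 0.6).
rng = np.random.default_rng(0)
n, subjects = 5, 2000
residuals = rng.multivariate_normal(np.zeros(n), ar1(0.6, n), size=subjects)

best, distances = select_structure(residuals)
print(best, distances)
```

In this simulated setting the AR(1) candidate should give the smaller distance, since the data were generated from it. With real GEE output, the residuals would come from the fitted mean model, and the comparison would involve covariance rather than correlation matrices.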
Keywords
Longitudinal data, Generalized estimating equations, Working correlation structure
Notes
Acknowledgements
This work was partially supported by Grant MTM2013-40778-R. The authors would like to thank the Editor and two Referees for the helpful comments and suggestions.