Working correlation structure selection in generalized estimating equations
Abstract
Selecting an appropriate correlation structure when analyzing longitudinal data can greatly improve the efficiency of parameter estimation, which leads to more reliable statistical inference. A number of criteria for selecting the working correlation structure have been proposed in the literature from different perspectives. However, little is known about the relative performance of these criteria. We review and evaluate these criteria by carrying out extensive simulation studies. Surprisingly, we find that the AIC and the BIC based on either the Gaussian working likelihood or the empirical likelihood outperform the others.
Keywords
Correlation information criterion; Empirical likelihood; Longitudinal data; Model selection
Notes
Acknowledgements
This research was funded by the Australian Research Council Discovery Projects (DP130100766 and DP160104292). L. Fu’s research was partly supported by the National Science Foundation of China (Grant Nos. 11201365 and 11301408), the Doctoral Programs Foundation of Ministry of Education of China (Grant No. 2012020112005), and the Fundamental Research Funds for the Central Universities (Grant No. xjj2017180).