Working correlation structure selection in GEE analysis

Regular Article

Abstract

The method of generalized estimating equations (GEE) models the association between repeated observations on a subject. Selecting an appropriate working correlation structure for the repeated measures on each subject is important because it improves the efficiency of the regression parameter estimates. Some existing selection criteria choose the structure for which the covariance matrix estimator and the specified working covariance matrix are closest. Building on this idea, we define a new criterion for selecting a working correlation structure. We compare our criterion with several existing criteria in terms of their ability to identify the true correlation structure, using simulations with Poisson, binomial and normal responses and with exchangeable or AR(1) intracluster correlation structures. We assume that every subject has the same number of observations. We also illustrate the performance of our criterion on two data sets. We conclude that our approach performs well as a selection criterion in most of the settings considered.
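
The "closeness" idea mentioned in the abstract can be illustrated with a small sketch. This is not the criterion proposed in the paper, whose definition is not given here. It is a simplified stand-in that assumes balanced data, compares an empirical correlation matrix with fitted exchangeable and AR(1) matrices, and measures closeness with the Frobenius norm. The function names (`exchangeable`, `ar1`, `closest_structure`), the moment estimates of the correlation parameter, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def exchangeable(n, alpha):
    """n x n exchangeable correlation matrix: 1 on the diagonal, alpha elsewhere."""
    return (1 - alpha) * np.eye(n) + alpha * np.ones((n, n))


def ar1(n, alpha):
    """n x n AR(1) correlation matrix: entry (j, k) equals alpha ** |j - k|."""
    idx = np.arange(n)
    return alpha ** np.abs(idx[:, None] - idx[None, :])


def closest_structure(residuals):
    """Return the candidate structure closest to the empirical correlation.

    residuals: array of shape (subjects, time points), balanced design.
    Closeness is the Frobenius norm of the difference (an illustrative
    choice, not the paper's criterion).
    """
    emp = np.corrcoef(residuals, rowvar=False)
    n = emp.shape[0]
    off_diagonal = ~np.eye(n, dtype=bool)
    # Simple moment estimates of the correlation parameter (assumption).
    alpha_exch = emp[off_diagonal].mean()
    alpha_ar1 = np.diag(emp, k=1).mean()
    candidates = {
        "independence": np.eye(n),
        "exchangeable": exchangeable(n, alpha_exch),
        "AR(1)": ar1(n, alpha_ar1),
    }
    distances = {
        name: np.linalg.norm(emp - R, "fro") for name, R in candidates.items()
    }
    best = min(distances, key=distances.get)
    return best, distances


# Hypothetical usage: normal responses with a true AR(1) structure.
rng = np.random.default_rng(0)
data = rng.multivariate_normal(np.zeros(4), ar1(4, 0.6), size=2000)
best, distances = closest_structure(data)
print(best, distances)
```

In a real GEE analysis the comparison would involve the estimated covariance matrices of the fitted model rather than a raw correlation matrix of simulated responses, so this sketch conveys only the general idea of picking the structure with the smallest discrepancy.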

Keywords

Longitudinal data; Generalized estimating equations; Working correlation structure

Notes

Acknowledgements

This work was partially supported by Grant MTM2013-40778-R. The authors would like to thank the Editor and two Referees for the helpful comments and suggestions.

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Department of Statistics and O.R. (I), Complutense University of Madrid, Madrid, Spain
