Generalized Estimating Equations

  • Reference work entry
Handbook of Epidemiology

Abstract

Generalized linear models (GLMs) are a standard regression approach for analyzing univariate non-normal data. In their breakthrough paper, Nelder and Wedderburn (1972) derived the GLM as a unifying approach for fitting models whose dependent variables are, for example, counts or dichotomous outcomes. GLMs are summarized in chapter Regression Methods for Epidemiological Analysis of this handbook, in the introductory textbook by Dobson (2001), and in the monograph by McCullagh and Nelder (1989). In a GLM, the user specifies a link function that relates the mean of the dependent variable to the independent variables. For example, in epidemiology, the standard choice for a dichotomous dependent variable is the logit link function for the mean structure, a model that has been known for more than 50 years. Another advantage of the GLM approach in this situation is that the variance function does not need to be specified explicitly: it follows automatically from the assumption that binary data are Bernoulli distributed.
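
To make the role of the link and variance functions concrete, the following is a minimal Python sketch, not taken from the chapter itself. It shows the logit link, its inverse, and the variance function v(mu) = mu(1 - mu) implied by the Bernoulli assumption. The coefficient values `beta0` and `beta1` and the covariate value `x` are hypothetical and chosen only for illustration.

```python
import math


def logit(mu):
    """Logit link: maps a mean in (0, 1) to the real line."""
    return math.log(mu / (1.0 - mu))


def inverse_logit(eta):
    """Inverse link: maps a linear predictor back to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def bernoulli_variance(mu):
    """Variance function implied by the Bernoulli assumption."""
    return mu * (1.0 - mu)


# Hypothetical regression coefficients and covariate value (illustration only).
beta0, beta1 = -1.0, 0.5
x = 2.0

eta = beta0 + beta1 * x        # linear predictor
mu = inverse_logit(eta)        # modeled probability of the event
var = bernoulli_variance(mu)   # variance follows from the mean, no extra input
```

Once the mean is modeled through the logit link, the variance is determined by the mean alone, which is the sense in which it does not need to be specified separately.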

References

  • Ballinger GA (2004) Using generalized estimating equations for longitudinal data analysis. Organ Res Method 7:127–150

  • Baradat P, Maillart M, Marpeau A, Slak MF, Yani A, Pastiszka P (1996) Utility of terpenes to assess population structure and mating patterns in conifers. In: Philippe B, Thomas A, Müller-Starck G (eds) Population genetics and genetic conservation of forest trees. Academic Publishing, Amsterdam, pp 5–27

  • Belsley DA, Kuh E, Welsch RE (1980) Regression diagnostics: identifying influential data and sources of collinearity. Wiley, New York

  • Cantoni E (2004) A robust approach to longitudinal data analysis. Can J Statist 32:169–180

  • Cantoni E, Flemming JM, Ronchetti E (2005) Variable selection for marginal longitudinal generalized linear models. Biometrics 61:507–514

  • Chaganty N, Joe H (2004) Efficiency of generalized estimating equations for binary responses. J R Stat Soc B 66:851–860

  • Cochran WG (1963) Sampling techniques, 2nd edn. Wiley, New York

  • Cook RD, Weisberg S (1982) Residuals and influence in regression. Chapman and Hall, New York

  • Cui J, Qian G (2007) Selection of working correlation structure and best model in GEE analyses of longitudinal data. Commun Stat Simul Comput 36:987–996

  • Dahmen G, Ziegler A (2004) Generalized estimating equations in controlled clinical trials: hypotheses testing. Biom J 46:214–232

  • Dahmen G, Ziegler A (2006) Independence estimating equations for controlled clinical trials with small sample size: interval estimation. Methods Inf Med 45:430–434

  • Dahmen G, Rochon J, König IR, Ziegler A (2004) Sample size calculations for controlled clinical trials using generalized estimating equations (GEE). Methods Inf Med 43:451–456

  • Davis CS (2002) Statistical methods for the analysis of repeated measurements. Springer, New York

  • Dennis J, Schnabel R (1983) Numerical methods for unconstrained optimization and nonlinear equations. Prentice-Hall, Englewood Cliffs

  • Diggle PJ, Liang KY, Zeger SL (1994) Analysis of longitudinal data. Clarendon Press, Oxford

  • Dobson AJ (2001) Introduction to generalized linear models, 2nd edn. Chapman and Hall, London

  • Evans S, Li L (2005) A comparison of goodness of fit tests for the logistic GEE model. Stat Med 24:1245–1261

  • Fahrmeir L, Pritscher L (1996) Regression analysis of forest damage by marginal models for correlated ordinal responses. Environ Ecol Stat 3:257–268

  • Fahrmeir L, Tutz G (1994) Multivariate statistical modelling based on generalized linear models. Springer, New York

  • Fitzmaurice GM, Laird NM (1993) A likelihood-based method for analysing longitudinal binary responses. Biometrika 80:141–151

  • Gail MH, Wieand S, Piantadosi S (1984) Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika 71:431–444

  • Gourieroux C, Monfort A (1995) Statistics and econometric models, vol 1. Cambridge University Press, Cambridge

  • Gourieroux C, Monfort A, Trognon A (1984) Pseudo maximum likelihood methods: theory. Econometrica 52:681–700

  • Greene W (1993) Econometric analysis, 2nd edn. Macmillan, New York

  • Hammill BG, Preisser JS (2006) A SAS/IML software program for GEE and regression diagnostics. Comput Stat Data Anal 51:1197–1212

  • Hanley JA, Negassa A, Edwardes MD (2000) GEE analysis of negatively correlated binary responses: a caution. Stat Med 19:715–722

  • Hanley JA, Negassa A, Edwardes MD, Forrester JE (2003) Statistical analysis of correlated data using generalized estimating equations: an orientation. Am J Epidemiol 157:364–375

  • Hasturk H, Nunn M, Warbington M, Van Dyke TE (2004) Efficacy of a fluoridated hydrogen peroxide-based mouthrinse for the treatment of gingivitis: a randomized clinical trial. J Periodontol 75:57–65

  • Hin LY, Wang YG (2009) Working-correlation-structure identification in generalized estimating equations. Stat Med 28:642–658

  • Hsieh FY, Lavori PW, Cohen HJ, Feussner JR (2003) An overview of variance inflation factors for sample-size calculation. Eval Health Prof 26:239–257

  • Jones B, Kenward MG (1989) Design and analysis of cross-over trials. Chapman & Hall, London

  • Jones B, Kenward MG (2003) Design and analysis of cross-over trials, 2nd edn. Chapman & Hall, London

  • Jung KM (2008) Local influence in generalized estimating equations. Scand J Stat 35:286–294

  • Kauermann G, Carroll RJ (2001) A note on the efficiency of sandwich covariance matrix estimation. J Am Stat Assoc 96:1387–1396

  • Lechner M, Lollivier S, Magnac T (2008) Parametric binary choice models. In: Mátyás L, Sevestre P (eds) The econometrics of panel data, 3rd edn. Springer, Heidelberg, pp 215–245

  • Liang K-Y, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22

  • Liang K-Y, Zeger SL, Qaqish B (1992) Multivariate regression analysis for categorical data. J R Stat Soc B 54:3–40

  • Mancl LA, Leroux BG (1996) Efficiency of regression estimates for clustered data. Biometrics 52:500–511

  • Martus P, Stroux A, Jünemann AM, Korth M, Jonas JB, Horn FK, Ziegler A (2004) GEE approaches to marginal regression models for medical diagnostic tests. Stat Med 23:1377–1398

  • McCullagh P, Nelder JA (1989) Generalized linear models, 2nd edn. Chapman & Hall, London

  • Nelder JA, Wedderburn RW (1972) Generalized linear models. J R Stat Soc A 135:370–384

  • Nuamah IF, Qu Y, Amini SB (1996) A SAS macro for stepwise correlated binary regression. Comput Method Program Biomed 49:199–210

  • Ogungbenro K, Aarons L, Graham G (2006) Sample size calculations based on generalized estimating equations for population pharmacokinetic experiments. J Biopharm Stat 16:135–150

  • Paik MC (1997) The generalized estimating equation approach when data are not missing completely at random. J Am Stat Assoc 92:1320–1329

  • Pan W (2001a) Akaike’s information criterion in generalized estimating equations. Biometrics 57:120–125

  • Pan W (2001b) Model selection in estimating equations. Biometrics 57:529–534

  • Pan W, Connett JE (2002) Selecting the working correlation structure in generalized estimating equations with application to the lung health study. Stat Sin 12:475–490

  • Pan W, Louis TA, Connett JE (2002) A note on marginal linear regression with correlated response data. Am Stat 54:191–195

  • Pepe MS, Anderson GL (1994) A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Commun Stat Simul Comput 23:939–951

  • Preisser JS, Perin J (2007) Deletion diagnostics for marginal mean and correlation model parameters in estimating equations. Stat Comput 17(4):381–393. doi:10.1007/s11222-007-9031-1

  • Preisser JS, Qaqish BF, Perin J (2008) A note on deletion diagnostics for estimating equations. Biometrika 95:509–513

  • Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866

  • Robins JM, Rotnitzky A, Zhao LP (1995) Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J Am Stat Assoc 90:106–120

  • Rochon J (1998) Application of GEE procedures for sample size calculations in repeated measures experiments. Stat Med 17:1643–1658

  • Rotnitzky A, Wypij D (1994) A note on the bias of estimators with missing data. Biometrics 50:1163–1170

  • Ryan L (1992) The use of generalized estimating equations for risk assessment in developmental toxicity. Risk Anal 12:439–447

  • Stokes ME (1999) Recent advances in categorical data analysis. Paper presented at the 24th annual meeting of the SAS users group international conference, Miami Beach. http://support.sas.com/rnd/app/papers/abstracts/categorical.html

  • Tan AG, Mitchell P, Burlutsky G, Rochtchina E, Kanthan G, Islam FM, Wang JJ (2008) Retinal vessel caliber and the long-term incidence of age-related cataract: the Blue Mountains Eye Study. Ophthalmology 115:1693–1698

  • Thomas W, Cook RD (1989) Assessing influence on regression coefficients in generalized linear models. Biometrika 76:741–749

  • Tu XM, Kowalski J, Zhang J, Lynch KG, Crits-Christoph P (2004) Power analyses for longitudinal trials and other clustered designs. Stat Med 23:2799–2815

  • Vanscheidt W, Rabe E, Naser-Hijazi B, Ramelet AA, Partsch H, Diehm C, Schultz-Ehrenburg U, Spengel F, Wirsching M, Götz V, Schnitker J, Henneicke-von Zepelin HH (2002) The efficacy and safety of a coumarin-/troxerutin-combination (SB-LOT) in patients with chronic venous insufficiency: a double blind placebo-controlled randomised study. VASA 31:185–190

  • Venezuela MK, Botter DA, Sandoval MC (2007) Diagnostic techniques in generalized estimating equations. J Stat Comput Simul 77:879–888

  • Vens M, Ziegler A (2012) Generalized estimating equations and regression diagnostics for longitudinal controlled clinical trials: A case study. Comput Stat Data Anal 56(5):1232–1242. doi:10.1016/j.csda.2011.04.010

  • Wang Y-G, Carey V (2003) Working correlation structure misspecification, estimation and covariate design: implications for generalised estimating equations performance. Biometrika 90:1–24

  • Wei WH, Fung WK (1999) The mean-shift outlier model in general weighted regression and its applications. Comput Stat Data Anal 30:429–441

  • Xie F, Paik MC (1997a) Generalized estimating equation model for binary outcomes with missing covariates. Biometrics 53:1458–1466

  • Xie F, Paik MC (1997b) Multiple imputation methods for the missing covariates in generalized estimating equation. Biometrics 53:1538–1546

  • Yang J, Peek-Asa C, Jones MP, Nordstrom DL, Taylor C, Young TL, Zwerling C (2008) Smoke alarms by type and battery life in rural households. Am J Prev Med 35:20–24

  • Zeger SL, Liang KY (1986) Longitudinal data analysis for discrete and continuous outcomes. Biometrics 42:121–130

  • Zeger S, Liang K, Self S (1985) The analysis of binary longitudinal data with time-independent covariates. Biometrika 72:31–38

  • Ziegler A (1995) The different parameterizations of the GEE1 and the GEE2. In: Seeber GUH, Francis BJ, Hatzinger R, Steckel-Berger G (eds) Statistical modelling proceedings of the 10th international workshop on statistical modelling. Lecture Notes in statistics, vol 104. Springer, Heidelberg, pp 315–324

  • Ziegler A, Arminger G (1996) Parameter estimation and regression diagnostics using generalized estimating equations. In: Faulbaum F, Bandilla W (eds) SoftStat ’95. Advances in statistical software 5. Lucius & Lucius, Heidelberg, pp 229–237

  • Ziegler A, Kastner C, Blettner M (1998) The generalised estimating equations: an annotated bibliography. Biom J 40:115–139

  • Ziegler A, Kastner C, Brunner D, Blettner M (2000) Familial associations of lipid profiles: a generalized estimating equations approach. Stat Med 19:3345–3357

  • Ziegler A, Kastner C, Chang-Claude J (2003) Analysis of pregnancy and other factors on detection of human papilloma virus (HPV) infection using weighted estimating equations for follow-up data. Stat Med 22:2217–2233

  • Ziegler A, Vens M (2010) Generalized estimating equations: Notes on the choice of the working correlation matrix. Methods Inf Med 49(5):421–425. doi:10.3414/ME10-01-0026

  • Ziegler A (2011) Generalized estimating equations: Theory. Springer, New York

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Ziegler, A., Vens, M. (2014). Generalized Estimating Equations. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_45

  • DOI: https://doi.org/10.1007/978-0-387-09834-0_45

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-09833-3

  • Online ISBN: 978-0-387-09834-0

  • eBook Packages: Medicine, Reference Module Medicine
