Volume 28, Issue 1, pp 223–241

A robust conditional maximum likelihood estimator for generalized linear models with a dispersion parameter

  • Alfio Marazzi
  • Marina Valdora
  • Victor Yohai
  • Michael Amiguet
Original Paper


Highly robust and efficient estimators for generalized linear models with a dispersion parameter are proposed. The estimators are computed in three steps. In the first step, the maximum rank correlation estimator is used to consistently estimate the slopes up to a scale factor; the scale factor, the intercept, and the dispersion parameter are then robustly estimated using a simple regression model. In the second step, randomized quantile residuals based on these initial estimators are used to define a region S such that observations outside S are considered outliers. Finally, a conditional maximum likelihood (CML) estimator given the observations in S is computed. We show that, under the model, S tends to the whole space as the sample size increases. Therefore, the CML estimator tends to the unconditional maximum likelihood estimator, which implies that it is asymptotically fully efficient. Moreover, the CML estimator maintains the high degree of robustness of the initial one. The negative binomial regression case is studied in detail.
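The outlier-detection step can be illustrated with a minimal sketch of randomized quantile residuals (Dunn and Smyth 1996) for a negative binomial response. Everything named here is an assumption made for illustration: the function names `nb_pmf`, `nb_cdf`, `randomized_quantile_residual` and `flag_outliers` are hypothetical, the parametrization (mean mu, variance mu + alpha·mu²) is one common convention, the fitted means are taken as given rather than produced by the initial robust estimator, and the fixed cutoff of 3 merely stands in for the region S, whose actual definition is given in the paper and not in this abstract.

```python
import math
import random
from statistics import NormalDist


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu and variance mu + alpha * mu**2 (assumed parametrization)."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def nb_cdf(k, mu, alpha):
    """P(Y <= k) by direct summation; returns 0 for k < 0."""
    if k < 0:
        return 0.0
    return min(1.0, sum(nb_pmf(j, mu, alpha) for j in range(k + 1)))


def randomized_quantile_residual(y, mu, alpha, rng):
    """Draw u uniformly between F(y - 1) and F(y), then map it to a standard normal quantile."""
    lo = nb_cdf(y - 1, mu, alpha)
    hi = nb_cdf(y, mu, alpha)
    u = rng.uniform(lo, hi)
    # Clamp away from 0 and 1 so the normal quantile stays finite.
    eps = 1e-12
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)


def flag_outliers(ys, mus, alpha, cutoff=3.0, seed=0):
    """Return residuals and a flag per observation: True if |residual| <= cutoff.

    The fixed cutoff is an illustrative assumption, not the paper's rule for S.
    """
    rng = random.Random(seed)
    residuals = [randomized_quantile_residual(y, m, alpha, rng)
                 for y, m in zip(ys, mus)]
    inside = [abs(r) <= cutoff for r in residuals]
    return residuals, inside


# Hypothetical counts and fitted means; the last count is far above its mean.
ys = [2, 3, 1, 60]
mus = [2.0, 2.5, 1.5, 2.0]
residuals, inside = flag_outliers(ys, mus, alpha=0.5)
```

Under a correctly specified model these residuals are approximately standard normal, which is why a region defined on the residual scale can be applied uniformly across observations with different fitted means. A conditional likelihood would then be maximized over the observations flagged as inside.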


Keywords

Generalized linear model; Conditional maximum likelihood; Negative binomial regression; Overdispersion; Robust regression

Mathematics Subject Classification

62F10 62F12 62F35 62J12 62J20 

Supplementary material

Supplementary material 1: 11749_2018_624_MOESM1_ESM.pdf (PDF, 332 KB)



Copyright information

© Sociedad de Estadística e Investigación Operativa 2018

Authors and Affiliations

  1. Institute of Social and Preventive Medicine, Lausanne, Switzerland
  2. Nice Computing, Le Mont-sur-Lausanne, Switzerland
  3. Departamento de matematicas and Instituto de cálculo, Facultad de ciencias exactas y naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
  4. CONICET, Buenos Aires, Argentina
