Analyzing polytomous responses from a complex survey scheme, such as stratified or cluster sampling, is crucial in several socio-economic applications. We present a class of minimum quasi weighted density power divergence estimators for the polytomous logistic regression model under such a complex survey design. This family of semiparametric estimators is a robust generalization of the maximum quasi weighted likelihood estimator that exploits the advantages of the popular density power divergence measure. Robust estimators of the design effects are derived accordingly. Using the new estimators, robust tests of general linear hypotheses on the regression coefficients are proposed. Their asymptotic distributions and robustness properties are studied theoretically and validated empirically through a numerical example and an extensive Monte Carlo study.
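As a rough illustration of the two ingredients named in the abstract, the sketch below computes the category probabilities of a polytomous logistic model (last category taken as baseline) and the density power divergence of Basu et al. (1998) between an observed proportion vector and the model probabilities. It covers only the unweighted, single-observation core; the survey weights, the cluster structure and the quasi weighted objective of the paper are not reproduced. The function names `plr_probs` and `dpd` and the tuning parameter name `lam` are assumptions made for this example.

```python
import numpy as np


def plr_probs(x, beta):
    """Category probabilities of a polytomous logistic model.

    x    : (k,) covariate vector
    beta : (k, d) coefficients for the first d categories; the last
           category is the baseline with coefficients fixed at zero.
    """
    eta = np.append(x @ beta, 0.0)   # linear predictors, baseline = 0
    eta = eta - eta.max()            # shift for numerical stability
    w = np.exp(eta)
    return w / w.sum()


def dpd(p, pi, lam):
    """Density power divergence between discrete distributions.

    p   : observed (empirical) probabilities
    pi  : model probabilities (assumed strictly positive)
    lam : tuning parameter >= 0; lam = 0 gives the Kullback-Leibler
          divergence, which corresponds to maximum likelihood.
    """
    p = np.asarray(p, dtype=float)
    pi = np.asarray(pi, dtype=float)
    if lam == 0:
        mask = p > 0
        return float(np.sum(p[mask] * np.log(p[mask] / pi[mask])))
    return float(np.sum(pi ** (1 + lam)
                        - (1 + 1 / lam) * p * pi ** lam
                        + p ** (1 + lam) / lam))


if __name__ == "__main__":
    x = np.array([1.0, 0.5])                     # hypothetical covariates
    beta = np.array([[0.2, -0.1], [0.4, 0.3]])   # hypothetical coefficients
    pi = plr_probs(x, beta)
    p_obs = np.array([0.5, 0.3, 0.2])            # hypothetical proportions
    for lam in (0.0, 0.25, 0.5):
        print(lam, dpd(p_obs, pi, lam))
```

Larger values of `lam` downweight categories to which the model assigns low probability, which is the source of robustness; a minimum DPD estimator would minimise a (weighted) sum of such terms over `beta`.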
The authors would like to thank the reviewer for his/her helpful comments and suggestions. This research is partially supported by Grant PGC2018-005194-B-100 and Grant FPU16/0314 from Ministerio de Ciencia, Innovación y Universidades (Spain).
Cite this article
Castilla, E., Ghosh, A., Martin, N. et al. Robust semiparametric inference for polytomous logistic regression with complex survey design. Adv Data Anal Classif (2020). https://doi.org/10.1007/s11634-020-00430-7
Keywords
- Cluster sampling
- Design effect
- Minimum quasi weighted DPD estimator
- Polytomous logistic regression model
- Pseudo minimum phi-divergence estimator