Minimum phi-divergence estimators for multinomial logistic regression with complex sample design

Castilla, Elena; Martín, Nirian; Pardo, Leandro

doi:10.1007/s10182-017-0311-6

Original Paper
Published: 28 October 2017

AStA Advances in Statistical Analysis, Volume 102, pages 381–411 (2018)

Abstract

This article develops the theoretical framework needed to study the multinomial logistic regression model under complex sample design with pseudo-minimum phi-divergence estimators. Through a numerical example and a simulation study, new estimators are proposed for the parameter of the logistic regression model with overdispersed multinomial distributions for the response variables, namely the pseudo-minimum Cressie–Read divergence estimators, as well as new estimators for the intra-cluster correlation coefficient. The simulation study shows that Binder’s method for the intra-cluster correlation coefficient exhibits excellent performance when the pseudo-minimum Cressie–Read divergence estimator with $\lambda =\frac{2}{3}$ is plugged in.
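For orientation, the Cressie–Read family mentioned above can be sketched numerically. The block below is a minimal illustration, assuming the standard power-divergence form $\frac{1}{\lambda (\lambda +1)}\sum _{s}p_{s}\left[ (p_{s}/q_{s})^{\lambda }-1\right]$ between two probability vectors (normalised so that $\phi ^{\prime \prime }(1)=1$); this excerpt does not reproduce the paper's own definition, and the function name `cressie_read` is hypothetical.

```python
import math

def cressie_read(p, q, lam):
    """Cressie-Read power divergence between probability vectors p and q.

    Assumed standard form: sum_s q_s * phi_lam(p_s / q_s) with phi_lam''(1) = 1.
    lam = 0 gives the Kullback-Leibler limit; lam = 1 gives half of
    Pearson's chi-square. Requires q_s > 0 (and p_s > 0 when lam <= 0).
    """
    if abs(lam) < 1e-12:
        # Limit lam -> 0: sum_s p_s * log(p_s / q_s)
        return sum(ps * math.log(ps / qs) for ps, qs in zip(p, q) if ps > 0)
    if abs(lam + 1.0) < 1e-12:
        # Limit lam -> -1: sum_s q_s * log(q_s / p_s)
        return sum(qs * math.log(qs / ps) for ps, qs in zip(p, q))
    total = sum(ps * ((ps / qs) ** lam - 1.0) for ps, qs in zip(p, q))
    return total / (lam * (lam + 1.0))

# Illustrative values only (not taken from the paper).
p_hat = [0.2, 0.3, 0.5]     # observed proportions
pi = [0.25, 0.25, 0.5]      # model probabilities
print(cressie_read(p_hat, pi, 2.0 / 3.0))
```

Under this form, minimising over the model probabilities with $\lambda =0$ corresponds to the likelihood-based estimator, while $\lambda =\frac{2}{3}$ is the member highlighted in the abstract.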

References

Agresti, A.: Categorical Data Analysis, 2nd edn. Wiley, Hoboken (2002)
Alonso-Revenga, J.M., Martín, N., Pardo, L.: New improved estimators for overdispersion in models with clustered multinomial data and unequal cluster sizes. Stat. Comput. 27, 193–217 (2017)
Amemiya, T.: Qualitative response models: a survey. J. Econ. Lit. 19, 1483–1536 (1981)
An, A.B.: Performing logistic regression on survey data with the new SURVEYLOGISTIC procedure. In: Proceedings of the 27th Annual SAS Users Group International Conference, CD-Rom Version, Paper 258-27 (2002)
Anderson, J.A.: Separate sample logistic discrimination. Biometrika 59, 19–35 (1972)
Anderson, J.A.: Logistic discrimination. In: Krishnaiah, R., Kanal, L.N. (eds.) Handbook of Statistics, pp. 169–191. North-Holland Publishing Company, Amsterdam (1982)
Anderson, J.A.: Regression and ordered categorical variables. J. R. Stat. Soc. Ser. B 46, 1–30 (1984)
Binder, D.A.: On the variance of asymptotically normal estimators from complex surveys. Int. Stat. Rev. 51, 279–292 (1983)
Engel, J.: Polytomous logistic regression. Stat. Neerl. 42, 233–252 (1988)
Ghosh, A., Basu, A.: Robust estimation for independent but non-homogeneous observations using density power divergence with application to linear regression. Electron. J. Stat. 7, 2420–2456 (2013)
Ghosh, A., Basu, A.: Robust estimation for non-homogeneous data and the selection of the optimal tuning parameter: the density power divergence approach. J. Appl. Stat. 42(9), 2056–2072 (2015)
Ghosh, A., Harris, I.R., Maji, A., Basu, A., Pardo, L.: A generalized divergence for statistical inference. Bernoulli 23(4A), 2746–2783 (2017a)
Ghosh, A., Martín, N., Basu, A., Pardo, L.: A new class of robust two-sample Wald-type tests (2017b). arXiv:1702.04552
Gupta, A.K., Kasturiratna, D., Nguyen, T., Pardo, L.: A new family of BAN estimators for polytomous logistic regression models based on phi-divergence measures. Stat. Methods Appl. 15, 159–176 (2006a)
Gupta, A.K., Nguyen, T., Pardo, L.: Inference procedures for polytomous logistic regression models based on phi-divergence measures. Math. Methods Stat. 15, 269–288 (2006b)
Gupta, A.K., Pardo, L.: Phi-divergences and polytomous logistic regression models: an overview. J. Stat. Plan. Inference 137, 3513–3524 (2007)
Gupta, A.K., Nguyen, T., Pardo, L.: Residuals for polytomous logistic regression models based on phi-divergences test statistics. Statistics 42, 495–514 (2008)
Hong, C., Kim, Y.: Automatic selection of the tuning parameter in the minimum density power divergence estimation. J. Korean Stat. Soc. 30, 453–465 (2001)
Lehtonen, R., Pahkinen, E.: Practical Methods for Design and Analysis of Complex Surveys. Wiley, Chichester (1995)
Lesaffre, E.: Logistic discrimination analysis with application in electrocardiography. Doctoral thesis, University of Leuven (1986)
Lesaffre, E., Albert, A.: Multiple-group logistic regression diagnostic. Appl. Stat. 38, 425–440 (1989)
Liu, I., Agresti, A.: The analysis of ordered categorical data: an overview and a survey of recent developments. With discussion and a rejoinder by the authors. Test 14, 1–73 (2005)
Mantel, N.: Models for complex contingency tables and polychotomous dosage response curves. Biometrics 22, 83–95 (1966)
McCullagh, P.: Regression models for ordinal data. J. R. Stat. Soc. Ser. B 42, 109–142 (1980)
Molina, E.A., Skinner, C.J.: Pseudo-likelihood and quasi-likelihood estimation for complex sampling schemes. Comput. Stat. Data Anal. 13, 395–405 (1992)
Morel, G.: Logistic regression under complex survey designs. Surv. Methodol. 15, 203–223 (1989)
Morel, G., Neerchal, N.K.: Overdispersion Models in SAS. SAS Institute, Cary (2012)
Pardo, L.: Statistical Inference Based on Divergence Measures. Statistics: Textbooks and Monographs. Chapman & Hall/CRC, New York (2005)
Rao, J.N.K., Scott, A.J.: On Chi-squared tests for multinomial contingency tables with cell proportions estimated from survey data. Ann. Stat. 6, 461–464 (1984)
Rao, J.N.K., Thomas, D.R.: Chi-squared tests for contingency tables. In: Skinner, C.J., Holt, D., Smith, T.M.F. (eds.) Analysis of Complex Surveys, pp. 89–114. Wiley, New York (1989)
Roberts, G., Rao, J.N.K., Kumar, S.: Logistic regression analysis of sample survey data. Biometrika 74, 1–12 (1987)
SAS Institute Inc.: SAS/STAT® 13.1 User’s Guide. Cary, NC (2013)
Skinner, C.J., Holt, D., Smith, T.M.F.: Analysis of Complex Surveys. Wiley, New York (1989)
Theil, H.: A multinomial extension of the linear logit model. Int. Econ. Rev. 10, 251–259 (1969)
Warwick, J., Jones, M.C.: Choosing a robustness tuning parameter. J. Stat. Comput. Simul. 75, 581–588 (2005)

Acknowledgements

We thank the referees for their helpful comments and suggestions, which have improved the paper. This research is partially supported by Grants MTM2012-33740, MTM2015-67057-P and ECO2015-66593-P from Ministerio de Economía y Competitividad (Spain).

Author information

Authors and Affiliations

Department of Statistics and Operations Research, Complutense University of Madrid, Madrid, Spain
Elena Castilla, Nirian Martín & Leandro Pardo

Corresponding author

Correspondence to Leandro Pardo.

A Proof of results

A.1 Proof of Theorem 1

Proof

The pseudo-minimum phi-divergence estimator of $\varvec{\beta }$, $ \widehat{\varvec{\beta }}_{\phi ,P}$, is obtained by solving the system of equations $\frac{\partial }{\partial \varvec{\beta }}d_{\phi }\left( \widehat{\varvec{p}},\varvec{\pi }\left( \varvec{\beta }\right) \right) =\varvec{0}_{dk}$, and hence it is also obtained from $ \varvec{u}_{\phi }\left( \varvec{\beta }\right) =\varvec{0}_{dk}$, where

$$\begin{aligned} \varvec{u}_{\phi }\left( \varvec{\beta }\right) =-\frac{\tau }{ \phi ^{\prime \prime }(1)}\frac{\partial }{\partial \varvec{\beta }}d_{\phi }\left( \widehat{\varvec{p}},\varvec{\pi }\left( \varvec{\beta } \right) \right) =\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}\varvec{u} _{\phi ,hi}\left( \varvec{\beta }\right) , \end{aligned}$$

with

$$\begin{aligned} \varvec{u}_{\phi ,hi}\left( \varvec{\beta }\right)&=-\frac{ w_{hi}m_{hi}}{\phi ^{\prime \prime }(1)}\frac{\partial }{\partial \varvec{ \beta }}d_{\phi }\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}}, \varvec{\pi }_{hi}(\varvec{\beta })\right) =\frac{w_{hi}m_{hi}}{ \phi ^{\prime \prime }(1)}\sum \limits _{s=1}^{d+1}\frac{\partial \pi _{his}( \varvec{\beta })}{\partial \varvec{\beta }}f_{\phi ,his}(\tfrac{\widehat{ y}_{his}}{m_{hi}},\varvec{\beta }) \\&=\frac{w_{hi}m_{hi}}{\phi ^{\prime \prime }(1)}\frac{\partial \varvec{\pi } _{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{f} _{\phi ,hi}(\tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}},\varvec{\beta }), \end{aligned}$$

and

$$\begin{aligned} \varvec{f}_{\phi ,hi}\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}}, \varvec{\beta }\right) =\left( f_{\phi ,hi1}\left( \tfrac{\widehat{y}_{hi1}}{m_{hi}}, \varvec{\beta }\right) ,\ldots ,f_{\phi ,hi,d+1}\left( \tfrac{\widehat{y}_{hi,d+1}}{m_{hi}}, \varvec{\beta }\right) \right) ^{T}. \end{aligned}$$

Since

$$\begin{aligned} \frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}=\left( \varvec{I}_{d\times d},\varvec{0} _{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta }\right) )\otimes \varvec{x}_{hi}, \end{aligned}$$

the expression of $\varvec{u}_{\phi ,hi}\left( \varvec{\beta }\right) $ can be rewritten as (16)–(17). $\square $
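The Jacobian identity used in this proof can be checked numerically. The following is a minimal sketch, assuming the usual baseline-category parametrisation $\pi _{s}\propto \exp (\varvec{x}^{T}\varvec{\beta }_{s})$ for $s=1,\ldots ,d$ with the last category as reference, and $\varvec{\Delta }(\varvec{\pi })=\mathrm {diag}(\varvec{\pi })-\varvec{\pi }\varvec{\pi }^{T}$; both are assumptions, since the model definition is not part of this appendix, and all function names are hypothetical.

```python
import math

def probs(beta, x):
    """Category probabilities pi_1..pi_{d+1}; the last category is the baseline.
    beta is a list of d coefficient vectors, each of length k = len(x)."""
    eta = [sum(b * xj for b, xj in zip(beta_s, x)) for beta_s in beta]
    denom = 1.0 + sum(math.exp(e) for e in eta)
    return [math.exp(e) / denom for e in eta] + [1.0 / denom]

def jacobian_closed_form(beta, x):
    """(I_d, 0) Delta(pi) Kronecker x, as a (d*k) x (d+1) nested list."""
    pi = probs(beta, x)
    d, k = len(beta), len(x)
    rows = []
    for r in range(d):        # block row r: derivative w.r.t. beta_r
        for j in range(k):    # j-th component of beta_r
            rows.append([((pi[r] if r == s else 0.0) - pi[r] * pi[s]) * x[j]
                         for s in range(d + 1)])
    return rows

def jacobian_numeric(beta, x, eps=1e-6):
    """Central finite differences of pi^T with respect to the stacked beta."""
    d, k = len(beta), len(x)
    rows = []
    for r in range(d):
        for j in range(k):
            up = [list(b) for b in beta]
            dn = [list(b) for b in beta]
            up[r][j] += eps
            dn[r][j] -= eps
            p_up, p_dn = probs(up, x), probs(dn, x)
            rows.append([(a - b) / (2 * eps) for a, b in zip(p_up, p_dn)])
    return rows

# Illustrative values only: d = 2 non-baseline categories, k = 2 covariates.
beta = [[0.3, -0.5], [-0.2, 0.8]]
x = [1.0, 0.7]
A = jacobian_closed_form(beta, x)
B = jacobian_numeric(beta, x)
max_gap = max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))
print(max_gap)  # a tiny value indicates the two Jacobians agree
```

Each row of the closed-form matrix sums to zero, reflecting that the probabilities add up to one for every $\varvec{\beta }$.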

A.2 Proof of Theorem 2

Proof

From Theorem 1 and by following the same steps of the linearization method of Binder (1983),

$$\begin{aligned} \mathbf {G}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty } \varvec{V}\left[ \tfrac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{ \beta }\right) \right] \quad \text {and}\quad \mathbf {H}\left( \varvec{\beta } \right) =-\lim _{n\rightarrow \infty }\frac{1}{n}\frac{\partial \varvec{U} _{\phi }^{T}\left( \varvec{\beta }\right) }{\partial \varvec{\beta }}, \end{aligned}$$

where $\varvec{U}_{\phi }\left( \varvec{\beta }\right) $ is the random vector generating $\varvec{u}_{\phi }\left( \varvec{\beta }\right) $, which is given by (15). Taking into account that $f_{\phi ,his}(\pi _{his}(\varvec{\beta }),\varvec{\beta })=0$ and $f_{\phi ,his}^{\prime }(\pi _{his}(\varvec{\beta }),\varvec{\beta })=\frac{1}{\pi _{his}( \varvec{\beta })}\phi ^{\prime \prime }\left( 1\right) $, a first-order Taylor expansion of $f_{\phi ,his}(\tfrac{\widehat{Y}_{his}}{m_{hi}},\varvec{ \beta })$, given in (18), is

$$\begin{aligned} f_{\phi ,his}\left( \tfrac{\widehat{Y}_{his}}{m_{hi}},\varvec{\beta }\right)&=f_{\phi ,his}\left( \pi _{his}\left( \varvec{\beta }\right) ,\varvec{\beta }\right) +f_{\phi ,his}^{\prime }\left( \pi _{his}\left( \varvec{\beta }\right) ,\varvec{\beta }\right) \left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) \\&=\frac{\phi ^{\prime \prime }\left( 1\right) }{\pi _{his}\left( \varvec{\beta }\right) }\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) , \end{aligned} \tag{28}$$

i.e.,

$$\begin{aligned} \varvec{f}_{\phi ,hi}\left( \tfrac{\widehat{\varvec{Y}}_{hi}}{m_{hi}}, \varvec{\beta }\right) =\phi ^{\prime \prime }\left( 1\right) \mathrm {diag}^{-1}\left( \varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \left( \tfrac{\widehat{\varvec{Y}} _{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{ \widehat{\varvec{Y}}_{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{ \beta }\right) \right) , \end{aligned}$$

and hence,

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right)= & {} \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi} \frac{\partial \varvec{\pi }_{hi}^{T}\left( \varvec{\beta }\right) }{\partial \varvec{\beta }}\mathrm {diag}^{-1}\left( \varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \left( \tfrac{\widehat{\varvec{Y}}_{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \\&\quad +\sum \limits _{h=1}^{H}\sqrt{\eta _{h}^{*}}o\left( \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{\varvec{Y}} _{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}\left( \varvec{\beta } \right) \right) \right) . \end{aligned}$$

From the Central Limit Theorem

$$\begin{aligned} \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{\varvec{Y}} _{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}\left( \varvec{\beta } \right) \right) \underset{n_{h}\rightarrow \infty }{\overset{\mathcal {L}}{ \longrightarrow }}\mathcal {N}\left( \varvec{0}_{d+1},\lim _{n_{h}\rightarrow \infty }\tfrac{1}{n_{h}}\sum _{i=1}^{n_{h}}\varvec{V}\left[ \widehat{ \varvec{Y}}_{hi}\right] \right) , \end{aligned}$$

then

$$\begin{aligned} o\left( \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{ \varvec{Y}}_{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}( \varvec{\beta })\right) \right) =o\left( O_{p}(\varvec{1} _{d+1})\right) =o_{p}(\varvec{1}_{d+1}), \end{aligned}$$

and thus,

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}\frac{ \partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}(\widehat{\varvec{y}}_{hi}-m_{hi}\varvec{\pi } _{hi}(\varvec{\beta }))+o_{p}(\varvec{1}_{dk}). \end{aligned}$$

Since

$$\begin{aligned} \frac{\partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{\pi }_{hi}(\varvec{\beta })&=\frac{ \partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{ \beta }}\mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta })) \varvec{\pi }_{hi}(\varvec{\beta }) \\&=\frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{1}_{d+1}=\frac{\partial \left( \varvec{\pi } _{hi}^{T}(\varvec{\beta })\varvec{1}_{d+1}\right) }{\partial \varvec{\beta }}=\varvec{0}_{dk},\\ \frac{\partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\widehat{\varvec{y}}_{hi}&=\frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }} \mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{ \varvec{Y}}_{hi} \\&=\left( \left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta } \right) )\otimes \varvec{x}_{hi}\right) \mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{\varvec{Y}}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta }\right) ) \mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{ \varvec{Y}}_{hi}\otimes \varvec{x}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \left( \widehat{\varvec{Y}}_{hi}-\varvec{\pi }_{hi}\left( \varvec{\beta } \right) \varvec{\pi }_{hi}\left( \varvec{\beta }\right) ^{T}\mathrm { diag}^{-1}\left( \pi _{hi}\left( \varvec{\beta }\right) \right) \widehat{ \varvec{Y}}_{hi}\right) \otimes \varvec{x}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \left( \widehat{\varvec{Y}}_{hi}-m_{hi}\varvec{\pi }_{hi}\left( \varvec{ \beta }\right) \right) \otimes \varvec{x}_{hi} \\&=\left( \widehat{\varvec{y}}_{hi}^{*}-m_{hi}\varvec{\pi } _{hi}^{*}\left( \varvec{\beta }\right) \right) \otimes \varvec{x} _{hi}, \end{aligned}$$

it follows that

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}\left( \widehat{\varvec{y}}_{hi}^{*}-m_{hi}\varvec{\pi }_{hi}^{*}( \varvec{\beta })\right) \otimes \varvec{x}_{hi}+o_{p}(\varvec{1} _{dk}). \end{aligned} \tag{29}$$

Then, $\mathbf {H}\left( \varvec{\beta }_{0}\right) $ is the limit of

$$\begin{aligned} -\frac{1}{n}\frac{\partial }{\partial \varvec{\beta }}\varvec{U}_{\phi }^{T}\left( \varvec{\beta }\right)&=\frac{1}{n}\sum \limits _{h=1}^{H} \sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi}\frac{\partial }{\partial \varvec{ \beta }}\varvec{\pi }_{hi}^{*}(\varvec{\beta })\otimes \varvec{x} _{hi}+o_{p}(\varvec{1}_{dk\times dk}) \\&=\frac{1}{n}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi} \varvec{\Delta }\left( \varvec{\pi }_{hi}^{*}\left( \varvec{\beta } \right) \right) \otimes \varvec{x}_{hi}+o_{p}(\varvec{1}_{dk\times dk}), \end{aligned}$$

as n increases, and hence, $\mathbf {H}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty }\mathbf {H}_{n}\left( \varvec{\beta }\right) $. On the other hand, from (29) it follows that

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\varvec{U}\left( \varvec{\beta }\right) +o_{p}( \varvec{1}_{dk}), \end{aligned}$$

and this justifies that $\mathbf {G}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty }\mathbf {G}_{n}\left( \varvec{\beta }\right) $. $\square $
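The algebraic step that turns the weighted score into the residual form (29) rests on $\varvec{\Delta }(\varvec{\pi })\,\mathrm {diag}^{-1}(\varvec{\pi })\,\widehat{\varvec{Y}}=\widehat{\varvec{Y}}-m\varvec{\pi }$ whenever the counts add up to $m$. A minimal numerical sketch of that step for a single cluster follows, assuming $\varvec{\Delta }(\varvec{\pi })=\mathrm {diag}(\varvec{\pi })-\varvec{\pi }\varvec{\pi }^{T}$; the numbers and function names are illustrative and do not come from the paper.

```python
def delta(pi):
    """Delta(pi) = diag(pi) - pi pi^T for a full probability vector pi."""
    n = len(pi)
    return [[(pi[r] if r == s else 0.0) - pi[r] * pi[s] for s in range(n)]
            for r in range(n)]

def score_term(y, pi, x):
    """((I_d, 0) Delta(pi) diag^{-1}(pi) y) Kronecker x."""
    d = len(pi) - 1
    D = delta(pi)
    scaled = [ys / ps for ys, ps in zip(y, pi)]       # diag^{-1}(pi) y
    top = [sum(D[r][s] * scaled[s] for s in range(d + 1)) for r in range(d)]
    return [t * xj for t in top for xj in x]          # Kronecker product with x

def residual_term(y, pi, x):
    """(y* - m pi*) Kronecker x, with m = sum(y) and * dropping the last entry."""
    d = len(pi) - 1
    m = sum(y)
    return [(y[r] - m * pi[r]) * xj for r in range(d) for xj in x]

# Illustrative cluster: d + 1 = 3 categories, m = 10 units, k = 2 covariates.
pi = [0.2, 0.3, 0.5]
y = [3, 1, 6]
x = [1.0, 0.7]
lhs = score_term(y, pi, x)
rhs = residual_term(y, pi, x)
gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(lhs, rhs, gap)
```

The two vectors coincide up to rounding, which is the content of the chain of equalities preceding (29).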

A.3 Proof of Theorem 3

Proof

If $\varvec{V}[\widehat{\varvec{Y}}_{hi}]=\nu _{m_{h}}m_{h} \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta } _{0}\right) )$, then from the expression of $\mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta }_{0}\right) $ given in Theorem 2,

$$\begin{aligned} \mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta }_{0}\right)&=\frac{1}{ n_{h}}\sum \limits _{i=1}^{n_{h}}w_{h}^{2}\varvec{V}\left[ \widehat{ \varvec{Y}}_{hi}^{*}\right] \otimes \varvec{x}_{hi}\varvec{x} _{hi}^{T}=\nu _{m_{h}}w_{h}\frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}w_{h}m_{h}\varvec{\Delta }\left( \varvec{\pi } _{hi}^{*}( \varvec{\beta }_{0}) \right) \otimes \varvec{x}_{hi} \varvec{x}_{hi}^{T} \\&=\nu _{m_{h}}w_{h}\mathbf {H}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) . \end{aligned}$$

Hence, from

$$\begin{aligned} \mathrm {trace}\left( \mathbf {H}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) ^{-1}\mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) \right) =\nu _{m_{h}}w_{h}dk, \end{aligned}$$

and consistency of $\mathbf {H}_{n_{h}}^{(h)}(\widehat{\varvec{\beta }} _{\phi ,P})$ and $\widehat{\mathbf {G}}_{n_{h}}^{(h)}(\widehat{\varvec{ \beta }}_{\phi ,P})$,

$$\begin{aligned} \widehat{\nu }_{m_{h}}(\widehat{\varvec{\beta }}_{\phi ,P})=\frac{1}{dk} \mathrm {trace}\left( \frac{1}{w_{h}}\mathbf {H}_{n_{h}}^{(h)}(\widehat{ \varvec{\beta }}_{\phi ,P})^{-1}\widehat{\mathbf {G}}_{n_{h}}^{(h)}( \widehat{\varvec{\beta }}_{\phi ,P})\right) , \end{aligned}$$

is proved with

$$\begin{aligned} \frac{1}{w_{h}}\mathbf {H}_{n_{h}}^{(h)}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) ^{-1}\widehat{\mathbf {G}}_{n_{h}}^{(h)}\left( \widehat{\varvec{\beta }} _{\phi ,P}\right)&=\left( \sum \limits _{i=1}^{n_{h}}m_{h}\varvec{\Delta }\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \otimes \varvec{x}_{hi}\varvec{x}_{hi}^{T}\right) ^{-1} \\&\quad \times \sum \limits _{i=1}^{n_{h}}\left( \varvec{v}_{hi}\left( \widehat{ \varvec{\beta }}_{\phi ,P}\right) -\varvec{\bar{v}}_{h}\left( \widehat{\varvec{ \beta }}_{\phi ,P}\right) \right) \left( \varvec{v}_{hi}\left( \widehat{\varvec{ \beta }}_{\phi ,P}\right) -\varvec{\bar{v}}_{h}\left( \widehat{\varvec{\beta }} _{\phi ,P}\right) \right) ^{T}, \\ \varvec{v}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\frac{1}{w_{h}} \varvec{u}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) , \end{aligned}$$

which is equivalent to (25). $\square $
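The trace identity at the heart of this proof can be illustrated with a small numerical sketch. It builds $\mathbf {H}$ and $\mathbf {G}$ for one stratum from made-up probabilities and covariates, under the variance assumption $\varvec{V}[\widehat{\varvec{Y}}_{hi}^{*}]=\nu m\varvec{\Delta }(\varvec{\pi }_{hi}^{*})$, and confirms that $\mathrm {trace}(\mathbf {H}^{-1}\mathbf {G})=\nu w\,dk$. All values and helper names are hypothetical, and the linear solver is a plain Gauss-Jordan routine written only for this illustration.

```python
def delta_star(p):
    """Delta(pi*) = diag(pi*) - pi* pi*^T for the first d probabilities."""
    d = len(p)
    return [[(p[r] if r == s else 0.0) - p[r] * p[s] for s in range(d)]
            for r in range(d)]

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def average(mats):
    """Entrywise average of equally sized matrices."""
    n, rows, cols = len(mats), len(mats[0]), len(mats[0][0])
    return [[sum(M[i][j] for M in mats) / n for j in range(cols)]
            for i in range(rows)]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

# Illustrative stratum: overdispersion nu, common weight w, cluster size m.
nu, w, m = 1.5, 2.0, 10
pis = [[0.2, 0.3], [0.5, 0.1], [0.3, 0.3]]    # pi*_hi for three clusters, d = 2
xs = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]    # covariates x_hi, k = 2
d, k = len(pis[0]), len(xs[0])

H_terms, G_terms = [], []
for p, x in zip(pis, xs):
    xxT = [[a * b for b in x] for a in x]
    D = delta_star(p)
    # H term: w m Delta(pi*) Kronecker x x^T
    H_terms.append(kron([[w * m * v for v in row] for row in D], xxT))
    # G term: w^2 V[Y*] Kronecker x x^T with V[Y*] = nu m Delta(pi*)
    G_terms.append(kron([[w * w * nu * m * v for v in row] for row in D], xxT))

H, G = average(H_terms), average(G_terms)
X = solve(H, G)                               # X = H^{-1} G
trace_val = sum(X[i][i] for i in range(d * k))
print(trace_val, nu * w * d * k)
```

Dividing the trace by $w\,dk$ recovers $\nu $, which is the motivation for the estimator $\widehat{\nu }_{m_{h}}$ once $\mathbf {G}$ is replaced by its empirical counterpart.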

A.4 Proof of Theorem 4

Proof

The mean vector and variance-covariance matrix of

$$\begin{aligned} \varvec{Z}_{hi}^{*}\left( \varvec{\beta }_{0}\right) =\sqrt{m_{h}}\varvec{ \Delta }^{-\frac{1}{2}}\left( \varvec{\pi }_{hi}^{*}\left( \varvec{\beta } _{0}\right) \right) \left( \tfrac{\widehat{\varvec{Y}}_{hi}^{*}}{m_{h}}- \varvec{\pi }_{hi}^{*}\left( \varvec{\beta }_{0}\right) \right) , \end{aligned}$$

are, respectively,

$$\begin{aligned} \varvec{E}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]&= \varvec{0}_{d}, \\ \varvec{V}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]&=\nu _{m_{h}}\varvec{I}_{d}, \end{aligned}$$

for $h=1,\ldots ,H$. An unbiased estimator of $\varvec{V}[\varvec{Z} _{hi}^{*}(\varvec{\beta }_{0})]$ is

$$\begin{aligned} \widehat{\varvec{V}}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]= \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*}( \varvec{\beta }_{0})\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0}), \end{aligned}$$

from which is derived

$$\begin{aligned} E\left[ \mathrm {trace}\widehat{\varvec{V}}[\varvec{Z}_{hi}^{*}( \varvec{\beta }_{0})]\right]&=\mathrm {trace}\varvec{V}[\varvec{Z }_{hi}^{*}(\varvec{\beta }_{0})], \\ E\left[ \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\mathrm {trace}\left( \varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0})\right) \right]&=\mathrm {trace}\left( \nu _{m_{h}} \varvec{I}_{d}\right) , \\ E\left[ \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*T}( \varvec{\beta }_{0})\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0}) \right]&=\nu _{m_{h}}d, \\ E\left[ \frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0})\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0}) \right]&=\nu _{m_{h}}. \end{aligned}$$

This expression suggests using

$$\begin{aligned} \widetilde{\nu }_{m_{h}}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\frac{1}{ n_{h}d}\sum \limits _{i=1}^{n_{h}}\widehat{\varvec{z}}_{hi,\phi ,P}^{*T}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \widehat{\varvec{z}}_{hi,\phi ,P}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \\&=\frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}m_{h}\left( \tfrac{\widehat{\varvec{y}}_{hi}^{*}}{ m_{h}}-\varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ^{T}\varvec{\Delta }^{-1}\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y} }_{hi}^{*}}{m_{h}}-\varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{ \beta }}_{\phi ,P}\right) \right) \\&=\frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}m_{h}\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{h}}- \varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ^{T} \varvec{\Delta }^{-}\left( \varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }} _{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{h}}- \varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ,\\ \widehat{\varvec{z}}_{hi,\phi ,P}^{*}&=\sqrt{m_{h}}\varvec{\Delta }^{-\frac{1}{2}}\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }} _{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y}}_{hi}^{*}}{m_{h}}- \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) . \end{aligned}$$

Finally, since $\varvec{\Delta }^{-}(\varvec{\pi }_{hi}(\widehat{ \varvec{\beta }}_{\phi ,P}))=\mathrm {diag}^{-1}(\varvec{\pi }_{hi}( \widehat{\varvec{\beta }}_{\phi ,P}))$ is a possible expression for the generalized inverse, the desired result for $\widetilde{\nu }_{m_{h}}( \widehat{\varvec{\beta }}_{\phi ,P})$ is obtained. $\square $
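With the generalized inverse $\mathrm {diag}^{-1}(\varvec{\pi })$, the estimator reduces to a Pearson-type statistic averaged over the $n_{h}$ clusters of a stratum. The block below is a minimal sketch of that computation, together with a check that the reduced form based on $\varvec{\Delta }^{-1}(\varvec{\pi }^{*})$ gives the same value; for the check it relies on the standard closed form $\varvec{\Delta }^{-1}(\varvec{\pi }^{*})=\mathrm {diag}^{-1}(\varvec{\pi }^{*})+\varvec{1}\varvec{1}^{T}/\pi _{d+1}$, which is an assumption not stated in this appendix. The data, fitted probabilities and function names are made up for illustration.

```python
def nu_tilde_full(ys, pis, m):
    """Pearson-type form: (1/(n_h d)) sum_i m sum_s (y_is/m - pi_is)^2 / pi_is,
    using all d + 1 categories and diag^{-1}(pi) as generalized inverse."""
    n_h, d = len(ys), len(ys[0]) - 1
    total = 0.0
    for y, pi in zip(ys, pis):
        total += m * sum((ys_ / m - ps) ** 2 / ps for ys_, ps in zip(y, pi))
    return total / (n_h * d)

def nu_tilde_reduced(ys, pis, m):
    """Same estimator from the first d categories, with
    Delta^{-1}(pi*) = diag^{-1}(pi*) + 1 1^T / pi_{d+1} (assumed closed form)."""
    n_h, d = len(ys), len(ys[0]) - 1
    total = 0.0
    for y, pi in zip(ys, pis):
        e = [y[s] / m - pi[s] for s in range(d)]      # residuals, first d cells
        quad = sum(es ** 2 / pi[s] for s, es in enumerate(e))
        quad += sum(e) ** 2 / pi[d]                   # rank-one correction term
        total += m * quad
    return total / (n_h * d)

# Illustrative stratum: n_h = 2 clusters of size m = 10, d + 1 = 3 categories.
ys = [[3, 1, 6], [2, 5, 3]]
pis = [[0.2, 0.3, 0.5], [0.2, 0.3, 0.5]]   # stand-ins for fitted probabilities
m = 10
full = nu_tilde_full(ys, pis, m)
reduced = nu_tilde_reduced(ys, pis, m)
print(full, reduced)
```

Values near one would indicate no overdispersion relative to the multinomial model, while larger values point to intra-cluster correlation.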

About this article

Cite this article

Castilla, E., Martín, N. & Pardo, L. Minimum phi-divergence estimators for multinomial logistic regression with complex sample design. AStA Adv Stat Anal 102, 381–411 (2018). https://doi.org/10.1007/s10182-017-0311-6

Received: 19 August 2016
Accepted: 19 October 2017
Published: 28 October 2017
Issue Date: 13 July 2018
DOI: https://doi.org/10.1007/s10182-017-0311-6
