
Minimum phi-divergence estimators for multinomial logistic regression with complex sample design

  • Original Paper
AStA Advances in Statistical Analysis

Abstract

This article develops the theoretical framework needed to study the multinomial regression model under complex sample designs with pseudo-minimum phi-divergence estimators. Through a numerical example and a simulation study, new estimators are proposed for the parameter of the logistic regression model with overdispersed multinomial distributions for the response variables, namely the pseudo-minimum Cressie–Read divergence estimators, together with new estimators for the intra-cluster correlation coefficient. The simulation study shows that Binder’s method for the intra-cluster correlation coefficient performs very well when the pseudo-minimum Cressie–Read divergence estimator with \(\lambda =\frac{2}{3}\) is plugged in.


References

  • Agresti, A.: Categorical Data Analysis, 2nd edn. Wiley, Hoboken (2002)

  • Alonso-Revenga, J.M., Martín, N., Pardo, L.: New improved estimators for overdispersion in models with clustered multinomial data and unequal cluster sizes. Stat. Comput. 27, 193–217 (2017)

  • Amemiya, T.: Qualitative response models: a survey. J. Econ. Lit. 19, 1483–1536 (1981)

  • An, A.B.: Performing logistic regression on survey data with the new SURVEYLOGISTIC procedure. In: Proceedings of the 27th Annual SAS Users Group International Conference, CD-Rom Version, Paper 258-27 (2002)

  • Anderson, J.A.: Separate sample logistic discrimination. Biometrika 59, 19–35 (1972)

  • Anderson, J.A.: Logistic discrimination. In: Krishnaiah, R., Kanal, L.N. (eds.) Handbook of Statistics, pp. 169–191. North-Holland Publishing Company, Amsterdam (1982)

  • Anderson, J.A.: Regression and ordered categorical variables. J. R. Stat. Soc. Ser. B 46, 1–30 (1984)

  • Binder, D.A.: On the variance of asymptotically normal estimators from complex surveys. Int. Stat. Rev. 51, 279–292 (1983)

  • Engel, J.: Polytomous logistic regression. Stat. Neerl. 42, 233–252 (1988)

  • Ghosh, A., Basu, A.: Robust estimation for independent but non-homogeneous observations using density power divergence with application to linear regression. Electron. J. Stat. 7, 2420–2456 (2013)

  • Ghosh, A., Basu, A.: Robust estimation for non-homogeneous data and the selection of the optimal tuning parameter: the density power divergence approach. J. Appl. Stat. 42(9), 2056–2072 (2015)

  • Ghosh, A., Harris, I.R., Maji, A., Basu, A., Pardo, L.: A generalized divergence for statistical inference. Bernoulli 23(4A), 2746–2783 (2017a)

  • Ghosh, A., Martín, N., Basu, A., Pardo, L.: A new class of robust two-sample Wald-type tests (2017b). arXiv:1702.04552

  • Gupta, A.K., Kasturiratna, D., Nguyen, T., Pardo, L.: A new family of BAN estimators for polytomous logistic regression models based on phi-divergence measures. Stat. Methods Appl. 15, 159–176 (2006a)

  • Gupta, A.K., Nguyen, T., Pardo, L.: Inference procedures for polytomous logistic regression models based on phi-divergence measures. Math. Methods Stat. 15, 269–288 (2006b)

  • Gupta, A.K., Pardo, L.: Phi-divergences and polytomous logistic regression models: an overview. J. Stat. Plan. Inference 137, 3513–3524 (2007)

  • Gupta, A.K., Nguyen, T., Pardo, L.: Residuals for polytomous logistic regression models based on phi-divergences test statistics. Statistics 42, 495–514 (2008)

  • Hong, C., Kim, Y.: Automatic selection of the tuning parameter in the minimum density power divergence estimation. J. Korean Stat. Soc. 30, 453–465 (2001)

  • Lehtonen, R., Pahkinen, E.: Practical Methods for Design and Analysis of Complex Surveys. Wiley, Chichester (1995)

  • Lesaffre, E.: Logistic discrimination analysis with application in electrocardiography. Doctoral thesis, University of Leuven (1986)

  • Lesaffre, E., Albert, A.: Multiple-group logistic regression diagnostics. Appl. Stat. 38, 425–440 (1989)

  • Liu, I., Agresti, A.: The analysis of ordered categorical data: an overview and a survey of recent developments. With discussion and a rejoinder by the authors. Test 14, 1–73 (2005)

  • Mantel, N.: Models for complex contingency tables and polychotomous dosage response curves. Biometrics 22, 83–95 (1966)

  • McCullagh, P.: Regression models for ordinal data. J. R. Stat. Soc. Ser. B 42, 109–142 (1980)

  • Molina, E.A., Skinner, C.J.: Pseudo-likelihood and quasi-likelihood estimation for complex sampling schemes. Comput. Stat. Data Anal. 13, 395–405 (1992)

  • Morel, G.: Logistic regression under complex survey designs. Surv. Methodol. 15, 203–223 (1989)

  • Morel, G., Neerchal, N.K.: Overdispersion Models in SAS. SAS Institute, Cary (2012)

  • Pardo, L.: Statistical Inference Based on Divergence Measures. Statistics: Textbooks and Monographs. Chapman & Hall/CRC, New York (2005)

  • Rao, J.N.K., Scott, A.J.: On Chi-squared tests for multinomial contingency tables with cell proportions estimated from survey data. Ann. Stat. 6, 461–464 (1984)

  • Rao, J.N.K., Thomas, D.R.: Chi-squared tests for contingency tables. In: Skinner, C.J., Holt, D., Smith, T.M.F. (eds.) Analysis of Complex Surveys, pp. 89–114. Wiley, New York (1989)

  • Roberts, G., Rao, J.N.K., Kumar, S.: Logistic regression analysis of sample survey data. Biometrika 74, 1–12 (1987)

  • SAS Institute Inc.: SAS/STAT® 13.1 User’s Guide. Cary, NC (2013)

  • Skinner, C.J., Holt, D., Smith, T.M.F.: Analysis of Complex Surveys. Wiley, New York (1989)

  • Theil, H.: A multinomial extension of the linear logit model. Int. Econ. Rev. 10, 251–259 (1969)

  • Warwick, J., Jones, M.C.: Choosing a robustness tuning parameter. J. Stat. Comput. Simul. 75, 581–588 (2005)


Acknowledgements

We would like to thank the referees for their helpful comments and suggestions. Their comments have improved the paper. This research is partially supported by Grants MTM2012-33740, MTM2015-67057-P and ECO2015-66593-P from Ministerio de Economia y Competitividad (Spain).

Author information

Corresponding author

Correspondence to Leandro Pardo.

A Proof of results

A.1 Proof of Theorem 1

Proof

The pseudo-minimum phi-divergence estimator of \(\varvec{\beta }\), \( \widehat{\varvec{\beta }}_{\phi ,P}\), is obtained by solving the system of equations \(\frac{\partial }{\partial \varvec{\beta }}d_{\phi }\left( \widehat{\varvec{p}},\varvec{\pi }\left( \varvec{\beta }\right) \right) =\varvec{0}_{dk}\); equivalently, it solves \( \varvec{u}_{\phi }\left( \varvec{\beta }\right) =\varvec{0}_{dk}\), where

$$\begin{aligned} \varvec{u}_{\phi }\left( \varvec{\beta }\right) =-\frac{\tau }{ \phi ^{\prime \prime }(1)}\frac{\partial }{\partial \varvec{\beta }}d_{\phi }\left( \widehat{\varvec{p}},\varvec{\pi }\left( \varvec{\beta } \right) \right) =\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}\varvec{u} _{\phi ,hi}\left( \varvec{\beta }\right) , \end{aligned}$$

with

$$\begin{aligned} \varvec{u}_{\phi ,hi}\left( \varvec{\beta }\right)&=-\frac{ w_{hi}m_{hi}}{\phi ^{\prime \prime }(1)}\frac{\partial }{\partial \varvec{ \beta }}d_{\phi }\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}}, \varvec{\pi }_{hi}(\varvec{\beta })\right) =\frac{w_{hi}m_{hi}}{ \phi ^{\prime \prime }(1)}\sum \limits _{s=1}^{d+1}\frac{\partial \pi _{his}( \varvec{\beta })}{\partial \varvec{\beta }}f_{\phi ,his}(\tfrac{\widehat{ y}_{his}}{m_{hi}},\varvec{\beta }) \\&=\frac{w_{hi}m_{hi}}{\phi ^{\prime \prime }(1)}\frac{\partial \varvec{\pi } _{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{f} _{\phi ,hi}(\tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}},\varvec{\beta }), \end{aligned}$$

and

$$\begin{aligned} \varvec{f}_{\phi ,hi}\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{hi}}, \varvec{\beta }\right) =\left( f_{\phi ,hi1}\left( \tfrac{\widehat{y}_{hi1}}{m_{hi}}, \varvec{\beta }\right) ,\ldots ,f_{\phi ,hi,d+1}\left( \tfrac{\widehat{y}_{hi,d+1}}{m_{hi}}, \varvec{\beta }\right) \right) ^{T}. \end{aligned}$$

Since

$$\begin{aligned} \frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}=\left( \varvec{I}_{d\times d},\varvec{0} _{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta }\right) )\otimes \varvec{x}_{hi}, \end{aligned}$$

the expression of \(\varvec{u}_{\phi ,hi}\left( \varvec{\beta }\right) \) is rewritten as (16)–(17). \(\square \)
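The derivative formula used in the last step can be checked numerically. The following is a minimal sketch, assuming the usual baseline-category logit parametrization with category \(d+1\) as reference and \(\varvec{\beta }=(\varvec{\beta }_{1}^{T},\ldots ,\varvec{\beta }_{d}^{T})^{T}\); the function names `probs`, `jacobian_closed_form` and `jacobian_numeric` are illustrative and not part of the paper.

```python
import numpy as np

def probs(beta, x, d):
    """Baseline-category logit probabilities (assumed parametrization).

    beta stacks beta_1, ..., beta_d, each of length k = x.size.
    """
    k = x.size
    eta = beta.reshape(d, k) @ x          # linear predictors of categories 1..d
    e = np.exp(np.append(eta, 0.0))       # reference category d+1 has predictor 0
    return e / e.sum()

def jacobian_closed_form(beta, x, d):
    """(I_d, 0) Delta(pi) kron x, a dk x (d+1) matrix."""
    pi = probs(beta, x, d)
    Delta = np.diag(pi) - np.outer(pi, pi)
    top = np.hstack([np.eye(d), np.zeros((d, 1))]) @ Delta
    return np.kron(top, x.reshape(-1, 1))

def jacobian_numeric(beta, x, d, eps=1e-6):
    """Central finite differences of pi^T with respect to beta."""
    J = np.empty((beta.size, d + 1))
    for j in range(beta.size):
        step = np.zeros_like(beta)
        step[j] = eps
        J[j] = (probs(beta + step, x, d) - probs(beta - step, x, d)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
d, k = 3, 2
beta = rng.normal(size=d * k)
x = rng.normal(size=k)
max_err = np.abs(jacobian_closed_form(beta, x, d) - jacobian_numeric(beta, x, d)).max()
print(max_err)  # should be at the level of finite-difference error
```

The Kronecker product with the column vector \(\varvec{x}_{hi}\) reproduces the block structure in which block \((r,s)\) equals \(\Delta _{rs}\varvec{x}_{hi}\).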

A.2 Proof of Theorem 2

Proof

From Theorem 1, and following the same steps as the linearization method of Binder (1983), we have

$$\begin{aligned} \mathbf {G}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty } \varvec{V}\left[ \tfrac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{ \beta }\right) \right] \quad \text {and}\quad \mathbf {H}\left( \varvec{\beta } \right) =-\lim _{n\rightarrow \infty }\frac{1}{n}\frac{\partial \varvec{U} _{\phi }^{T}\left( \varvec{\beta }\right) }{\partial \varvec{\beta }}, \end{aligned}$$

where \(\varvec{U}_{\phi }\left( \varvec{\beta }\right) \) is the random variable generating \(\varvec{u}_{\phi }\left( \varvec{\beta }\right) \) given by (15). Taking into account that \(f_{\phi ,his}(\pi _{his}(\varvec{\beta }),\varvec{\beta })=0\) and \(f_{\phi ,his}^{\prime }(\pi _{his}(\varvec{\beta }),\varvec{\beta })=\frac{1}{\pi _{his}( \varvec{\beta })}\phi ^{\prime \prime }\left( 1\right) \), a first-order Taylor expansion of \(f_{\phi ,his}(\tfrac{\widehat{Y}_{his}}{m_{hi}},\varvec{ \beta })\), given in (18), around \(\pi _{his}(\varvec{\beta })\) is

$$\begin{aligned} f_{\phi ,his}\left( \tfrac{\widehat{Y}_{his}}{m_{hi}},\varvec{\beta }\right)&=f_{\phi ,his}\left( \pi _{his}\left( \varvec{\beta }\right) ,\varvec{\beta }\right) +f_{\phi ,his}^{\prime }\left( \pi _{his}\left( \varvec{\beta }\right) ,\varvec{\beta }\right) \left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) \\&=\frac{\phi ^{\prime \prime }\left( 1\right) }{\pi _{his}\left( \varvec{\beta }\right) }\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{\widehat{Y}_{his}}{m_{hi}}-\pi _{his}\left( \varvec{\beta }\right) \right) , \end{aligned}$$
(28)

i.e.,

$$\begin{aligned} \varvec{f}_{\phi ,hi}\left( \tfrac{\widehat{\varvec{Y}}_{hi}}{m_{hi}}, \varvec{\beta }\right) =\phi ^{\prime \prime }\left( 1\right) \mathrm {diag}^{-1}\left( \varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \left( \tfrac{\widehat{\varvec{Y}} _{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) +o\left( \tfrac{ \widehat{\varvec{Y}}_{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{ \beta }\right) \right) , \end{aligned}$$

and hence,

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right)= & {} \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi} \frac{\partial \varvec{\pi }_{hi}^{T}\left( \varvec{\beta }\right) }{\partial \varvec{\beta }}\mathrm {diag}^{-1}\left( \varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \left( \tfrac{\widehat{\varvec{Y}}_{hi}}{m_{hi}}-\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \\&\quad +\sum \limits _{h=1}^{H}\sqrt{\eta _{h}^{*}}o\left( \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{\varvec{Y}} _{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}\left( \varvec{\beta } \right) \right) \right) . \end{aligned}$$

From the Central Limit Theorem,

$$\begin{aligned} \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{\varvec{Y}} _{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}\left( \varvec{\beta } \right) \right) \underset{n_{h}\rightarrow \infty }{\overset{\mathcal {L}}{ \longrightarrow }}\mathcal {N}\left( \varvec{0}_{d+1},\lim _{n_{h}\rightarrow \infty }\tfrac{1}{n_{h}}\sum _{i=1}^{n_{h}}\varvec{V}\left[ \widehat{ \varvec{Y}}_{hi}\right] \right) , \end{aligned}$$

then

$$\begin{aligned} o\left( \frac{1}{\sqrt{n_{h}}}\left( \sum _{i=1}^{n_{h}}\widehat{ \varvec{Y}}_{hi}-\sum _{i=1}^{n_{h}}m_{hi}\varvec{\pi }_{hi}( \varvec{\beta })\right) \right) =o\left( O_{p}(\varvec{1} _{d+1})\right) =o_{p}(\varvec{1}_{d+1}), \end{aligned}$$

and thus,

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}\frac{ \partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}(\widehat{\varvec{Y}}_{hi}-m_{hi}\varvec{\pi } _{hi}(\varvec{\beta }))+o_{p}(\varvec{1}_{dk}). \end{aligned}$$

Since

$$\begin{aligned} \frac{\partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{\pi }_{hi}(\varvec{\beta })&=\frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\varvec{\pi }_{hi}(\varvec{\beta }) \\&=\frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\varvec{1}_{d+1}=\frac{\partial \left( \varvec{\pi }_{hi}^{T}(\varvec{\beta })\varvec{1}_{d+1}\right) }{\partial \varvec{\beta }}=\varvec{0}_{dk},\\ \frac{\partial \log \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\widehat{\varvec{Y}}_{hi}&=\frac{\partial \varvec{\pi }_{hi}^{T}(\varvec{\beta })}{\partial \varvec{\beta }}\mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{\varvec{Y}}_{hi} \\&=\left( \left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta }\right) )\otimes \varvec{x}_{hi}\right) \mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{\varvec{Y}}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta }\right) )\mathrm {diag}^{-1}(\varvec{\pi }_{hi}(\varvec{\beta }))\widehat{\varvec{Y}}_{hi}\otimes \varvec{x}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \left( \widehat{\varvec{Y}}_{hi}-\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \varvec{\pi }_{hi}\left( \varvec{\beta }\right) ^{T}\mathrm {diag}^{-1}\left( \varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \widehat{\varvec{Y}}_{hi}\right) \otimes \varvec{x}_{hi} \\&=\left( \varvec{I}_{d\times d},\varvec{0}_{d\times 1}\right) \left( \widehat{\varvec{Y}}_{hi}-m_{hi}\varvec{\pi }_{hi}\left( \varvec{\beta }\right) \right) \otimes \varvec{x}_{hi} \\&=\left( \widehat{\varvec{Y}}_{hi}^{*}-m_{hi}\varvec{\pi }_{hi}^{*}\left( \varvec{\beta }\right) \right) \otimes \varvec{x}_{hi}, \end{aligned}$$

it follows that

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}\left( \widehat{\varvec{Y}}_{hi}^{*}-m_{hi}\varvec{\pi }_{hi}^{*}( \varvec{\beta })\right) \otimes \varvec{x}_{hi}+o_{p}(\varvec{1} _{dk}). \end{aligned}$$
(29)

Then, \(\mathbf {H}\left( \varvec{\beta }_{0}\right) \) is the limit of

$$\begin{aligned} -\frac{1}{n}\frac{\partial }{\partial \varvec{\beta }}\varvec{U}_{\phi }^{T}\left( \varvec{\beta }\right)&=\frac{1}{n}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi}\frac{\partial \varvec{\pi }_{hi}^{*T}(\varvec{\beta })}{\partial \varvec{\beta }}\otimes \varvec{x}_{hi}^{T}+o_{p}(\varvec{1}_{dk\times dk}) \\&=\frac{1}{n}\sum \limits _{h=1}^{H}\sum \limits _{i=1}^{n_{h}}w_{hi}m_{hi}\varvec{\Delta }\left( \varvec{\pi }_{hi}^{*}\left( \varvec{\beta }\right) \right) \otimes \varvec{x}_{hi}\varvec{x}_{hi}^{T}+o_{p}(\varvec{1}_{dk\times dk}), \end{aligned}$$

as n increases, and hence, \(\mathbf {H}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty }\mathbf {H}_{n}\left( \varvec{\beta }\right) \). On the other hand, from (29) it follows that

$$\begin{aligned} \frac{1}{\sqrt{n}}\varvec{U}_{\phi }\left( \varvec{\beta }\right) = \frac{1}{\sqrt{n}}\varvec{U}\left( \varvec{\beta }\right) +o_{p}( \varvec{1}_{dk}), \end{aligned}$$

and this justifies that \(\mathbf {G}\left( \varvec{\beta }\right) =\lim _{n\rightarrow \infty }\mathbf {G}_{n}\left( \varvec{\beta }\right) \). \(\square \)
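The algebraic identity that leads to (29) can be verified on a single cluster. The sketch below assumes \(\varvec{\Delta }(\varvec{\pi })=\mathrm {diag}(\varvec{\pi })-\varvec{\pi }\varvec{\pi }^{T}\) and uses arbitrary illustrative values for \(\varvec{\pi }_{hi}\), \(m_{hi}\) and \(\varvec{x}_{hi}\); none of these numbers come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, m = 3, 2, 10
pi = np.array([0.1, 0.2, 0.3, 0.4])          # illustrative probabilities, d+1 categories
y = rng.multinomial(m, pi).astype(float)     # one simulated cluster total
x = rng.normal(size=k)                       # illustrative covariate vector

Delta = np.diag(pi) - np.outer(pi, pi)
select = np.hstack([np.eye(d), np.zeros((d, 1))])        # (I_d, 0)
dpi_dbeta = np.kron(select @ Delta, x.reshape(-1, 1))    # dk x (d+1)

# Left-hand side: d(log pi^T)/d(beta) applied to (y - m pi)
lhs = dpi_dbeta @ np.diag(1.0 / pi) @ (y - m * pi)
# Right-hand side: (y* - m pi*) kron x, with * dropping the last category
rhs = np.kron(y[:d] - m * pi[:d], x)

gap = np.abs(lhs - rhs).max()
print(gap)  # should be at floating-point rounding level
```

The agreement relies on \(\varvec{1}_{d+1}^{T}\widehat{\varvec{Y}}_{hi}=m_{hi}\), which is why the cross term reduces to \(m_{hi}\varvec{\pi }_{hi}(\varvec{\beta })\).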

A.3 Proof of Theorem 3

Proof

If \(\varvec{V}[\widehat{\varvec{Y}}_{hi}]=\nu _{m_{h}}m_{h} \varvec{\Delta }(\varvec{\pi }_{hi}\left( \varvec{\beta } _{0}\right) )\), then from the expression of \(\mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta }_{0}\right) \) given in Theorem 2,

$$\begin{aligned} \mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta }_{0}\right)&=\frac{1}{ n_{h}}\sum \limits _{i=1}^{n_{h}}w_{h}^{2}\varvec{V}\left[ \widehat{ \varvec{Y}}_{hi}^{*}\right] \otimes \varvec{x}_{hi}\varvec{x} _{hi}^{T}=\nu _{m_{h}}w_{h}\frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}w_{h}m_{h}\varvec{\Delta }\left( \varvec{\pi } _{hi}^{*}( \varvec{\beta }_{0}) \right) \otimes \varvec{x}_{hi} \varvec{x}_{hi}^{T} \\&=\nu _{m_{h}}w_{h}\mathbf {H}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) . \end{aligned}$$

Hence, from

$$\begin{aligned} \mathrm {trace}\left( \mathbf {H}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) ^{-1}\mathbf {G}_{n_{h}}^{(h)}\left( \varvec{\beta } _{0}\right) \right) =\nu _{m_{h}}w_{h}dk, \end{aligned}$$

and the consistency of \(\mathbf {H}_{n_{h}}^{(h)}(\widehat{\varvec{\beta }} _{\phi ,P})\) and \(\widehat{\mathbf {G}}_{n_{h}}^{(h)}(\widehat{\varvec{ \beta }}_{\phi ,P})\), the estimator

$$\begin{aligned} \widehat{\nu }_{m_{h}}(\widehat{\varvec{\beta }}_{\phi ,P})=\frac{1}{dk} \mathrm {trace}\left( \frac{1}{w_{h}}\mathbf {H}_{n_{h}}^{(h)}(\widehat{ \varvec{\beta }}_{\phi ,P})^{-1}\widehat{\mathbf {G}}_{n_{h}}^{(h)}( \widehat{\varvec{\beta }}_{\phi ,P})\right) , \end{aligned}$$

is obtained, with

$$\begin{aligned} \frac{1}{w_{h}}\mathbf {H}_{n_{h}}^{(h)}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) ^{-1}\widehat{\mathbf {G}}_{n_{h}}^{(h)}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\left( \sum \limits _{i=1}^{n_{h}}m_{h}\varvec{\Delta }\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \otimes \varvec{x}_{hi}\varvec{x}_{hi}^{T}\right) ^{-1} \\&\quad \times \sum \limits _{i=1}^{n_{h}}\left( \varvec{v}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) -\varvec{\bar{v}}_{h}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \left( \varvec{v}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) -\varvec{\bar{v}}_{h}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ^{T}, \\ \varvec{v}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\frac{1}{w_{h}}\varvec{u}_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) , \end{aligned}$$

which is equivalent to (25). \(\square \)
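As an illustration of the trace formula, the following sketch simulates one stratum of plain (not overdispersed) multinomial clusters, for which \(\nu _{m_{h}}=1\), and evaluates the estimator. It rests on several assumptions that are not taken from the paper: the baseline-category logit parametrization, equal weights within the stratum, the true \(\varvec{\beta }_{0}\) used in place of \(\widehat{\varvec{\beta }}_{\phi ,P}\), and \(\varvec{v}_{hi}=(\widehat{\varvec{y}}_{hi}^{*}-m_{h}\varvec{\pi }_{hi}^{*})\otimes \varvec{x}_{hi}\) as suggested by (29). All numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, m, n_h = 2, 2, 8, 4000
beta = np.array([[0.5, -0.3], [-0.2, 0.4]])        # row r holds beta_r (illustrative)
X = np.column_stack([np.ones(n_h), rng.normal(size=n_h)])

# Baseline-category logit probabilities for every cluster (assumed model)
eta = np.column_stack([X @ beta.T, np.zeros(n_h)])
P = np.exp(eta)
P /= P.sum(axis=1, keepdims=True)
Y = np.array([rng.multinomial(m, p) for p in P], dtype=float)

H_sum = np.zeros((d * k, d * k))    # sum_i m * Delta(pi*_i) kron x_i x_i^T
V = np.empty((n_h, d * k))          # rows are v_i = (y*_i - m pi*_i) kron x_i
for i in range(n_h):
    p = P[i, :d]
    Delta_star = np.diag(p) - np.outer(p, p)
    H_sum += m * np.kron(Delta_star, np.outer(X[i], X[i]))
    V[i] = np.kron(Y[i, :d] - m * p, X[i])

Vc = V - V.mean(axis=0)             # centre the v_i within the stratum
nu_hat = np.trace(np.linalg.solve(H_sum, Vc.T @ Vc)) / (d * k)
print(nu_hat)  # expected to be close to 1 for multinomial data
```

With overdispersed responses the same computation would be expected to return values above one; in practice \(\widehat{\varvec{\beta }}_{\phi ,P}\) replaces the true parameter.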

A.4 Proof of Theorem 4

Proof

The mean vector and variance-covariance matrix of

$$\begin{aligned} \varvec{Z}_{hi}^{*}\left( \varvec{\beta }_{0}\right) =\sqrt{m_{h}}\varvec{ \Delta }^{-\frac{1}{2}}\left( \varvec{\pi }_{hi}^{*}\left( \varvec{\beta } _{0}\right) \right) \left( \tfrac{\widehat{\varvec{Y}}_{hi}^{*}}{m_{h}}- \varvec{\pi }_{hi}^{*}\left( \varvec{\beta }_{0}\right) \right) , \end{aligned}$$

are, respectively,

$$\begin{aligned} \varvec{E}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]&= \varvec{0}_{d}, \\ \varvec{V}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]&=\nu _{m_{h}}\varvec{I}_{d}, \end{aligned}$$

for \(h=1,\ldots ,H\). An unbiased estimator of \(\varvec{V}[\varvec{Z} _{hi}^{*}(\varvec{\beta }_{0})]\) is

$$\begin{aligned} \widehat{\varvec{V}}[\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})]= \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*}( \varvec{\beta }_{0})\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0}), \end{aligned}$$

from which it follows that

$$\begin{aligned} E\left[ \mathrm {trace}\widehat{\varvec{V}}[\varvec{Z}_{hi}^{*}( \varvec{\beta }_{0})]\right]&=\mathrm {trace}\varvec{V}[\varvec{Z }_{hi}^{*}(\varvec{\beta }_{0})], \\ E\left[ \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\mathrm {trace}\left( \varvec{Z}_{hi}^{*}(\varvec{\beta }_{0})\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0})\right) \right]&=\mathrm {trace}\left( \nu _{m_{h}} \varvec{I}_{d}\right) , \\ E\left[ \frac{1}{n_{h}}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*T}( \varvec{\beta }_{0})\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0}) \right]&=\nu _{m_{h}}d, \\ E\left[ \frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}\varvec{Z}_{hi}^{*T}(\varvec{\beta }_{0})\varvec{Z}_{hi}^{*}(\varvec{\beta }_{0}) \right]&=\nu _{m_{h}}. \end{aligned}$$

This expression suggests using

$$\begin{aligned} \widetilde{\nu }_{m_{h}}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}\widehat{\varvec{z}}_{hi,\phi ,P}^{*T}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \widehat{\varvec{z}}_{hi,\phi ,P}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \\&=\frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}m_{h}\left( \tfrac{\widehat{\varvec{y}}_{hi}^{*}}{m_{h}}-\varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ^{T}\varvec{\Delta }^{-1}\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y}}_{hi}^{*}}{m_{h}}-\varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \\&=\frac{1}{n_{h}d}\sum \limits _{i=1}^{n_{h}}m_{h}\left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{h}}-\varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ^{T}\varvec{\Delta }^{-}\left( \varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y}}_{hi}}{m_{h}}-\varvec{\pi }_{hi}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) ,\\ \widehat{\varvec{z}}_{hi,\phi ,P}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right)&=\sqrt{m_{h}}\varvec{\Delta }^{-\frac{1}{2}}\left( \varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) \left( \tfrac{\widehat{\varvec{y}}_{hi}^{*}}{m_{h}}-\varvec{\pi }_{hi}^{*}\left( \widehat{\varvec{\beta }}_{\phi ,P}\right) \right) . \end{aligned}$$

Finally, since \(\varvec{\Delta }^{-}(\varvec{\pi }_{hi}(\widehat{ \varvec{\beta }}_{\phi ,P}))=\mathrm {diag}^{-1}(\varvec{\pi }_{hi}( \widehat{\varvec{\beta }}_{\phi ,P}))\) is a possible expression for the generalized inverse, the desired result for \(\widetilde{\nu }_{m_{h}}( \widehat{\varvec{\beta }}_{\phi ,P})\) is obtained. \(\square \)
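The last step can be illustrated numerically. The sketch below, with arbitrary illustrative values for \(\varvec{\pi }_{hi}\) and \(m_{h}\), checks that \(\mathrm {diag}^{-1}(\varvec{\pi })\) satisfies the generalized-inverse condition \(\varvec{\Delta }\varvec{\Delta }^{-}\varvec{\Delta }=\varvec{\Delta }\) for \(\varvec{\Delta }(\varvec{\pi })=\mathrm {diag}(\varvec{\pi })-\varvec{\pi }\varvec{\pi }^{T}\), and that the \(d\)-dimensional and \((d+1)\)-dimensional quadratic forms coincide for one cluster, both reducing to a Pearson-type chi-square sum.

```python
import numpy as np

rng = np.random.default_rng(3)
d, m = 3, 12
pi = np.array([0.1, 0.2, 0.3, 0.4])          # illustrative probabilities
y = rng.multinomial(m, pi).astype(float)     # one simulated cluster total

Delta = np.diag(pi) - np.outer(pi, pi)       # singular (d+1) x (d+1) matrix
G = np.diag(1.0 / pi)                        # candidate generalized inverse

# Generalized-inverse condition: Delta G Delta = Delta
ginv_gap = np.abs(Delta @ G @ Delta - Delta).max()

r = y / m - pi                               # residual on all d+1 categories
q_full = m * r @ G @ r                       # form with diag^{-1}(pi)
# Form with the ordinary inverse of the d x d block Delta(pi*)
q_star = m * r[:d] @ np.linalg.solve(Delta[:d, :d], r[:d])
# Pearson-type sum: (observed - expected)^2 / expected
pearson = ((y - m * pi) ** 2 / (m * pi)).sum()

print(ginv_gap, q_full, q_star, pearson)
```

Averaging such quadratic forms over the \(n_{h}\) clusters of a stratum and dividing by \(d\) gives the estimator \(\widetilde{\nu }_{m_{h}}\) in the form stated above.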


Cite this article

Castilla, E., Martín, N. & Pardo, L. Minimum phi-divergence estimators for multinomial logistic regression with complex sample design. AStA Adv Stat Anal 102, 381–411 (2018). https://doi.org/10.1007/s10182-017-0311-6
