Prediction Theory for Multinomial Proportions Using Two-stage Cluster Samples

Sutradhar, Brajendra C.

doi:10.1007/s13171-022-00297-0

Published: 24 October 2022

Sankhya A, Volume 85, pages 1452–1488 (2023)

Abstract

In a two-stage cluster sampling setup for categorical data, it is well known that the so-called best prediction of the category-based proportions involves computing the conditional means of the non-sampled multinomial variables given the sampled multinomial responses. This computation is, however, not easy, mainly because of the complex cluster correlations among multinomial responses within a cluster. The independence-assumption-based approaches and the linear model approaches for cluster-correlated data used so far in the existing studies are not valid for computing such conditional means in the prediction function for multinomial data. As opposed to these ‘working’ independence or linear-model-based approaches, in this paper we first develop a cluster correlation structure for multinomial data and exploit this structure to derive theoretically valid formulas for the conditional means of the non-sampled hypothetical responses. Next, because these conditional means, or equivalently the prediction function, contain the regression and clustered variance/correlation parameters, we estimate these parameters using a conditional likelihood approach based on survey sampling weights, whereas the existing studies mostly use independence-assumption-based likelihood or moment approaches, which are invalid or inadequate in a correlation setup. The proposed conditional likelihood estimators are shown to be consistent for their respective parameters, leading to consistent estimation of the prediction function for the multinomial proportions.
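
The general shape of the best predictor mentioned in the abstract can be sketched in generic notation. All symbols below are assumptions introduced for this illustration and are not taken from the paper: $K$ clusters in the population, $N_i$ units in cluster $i$, $N = \sum_{i} N_i$, $s$ the set of sampled clusters, $s_i$ the set of sampled units in a sampled cluster $i$, $y_{ijc}$ the indicator that unit $j$ of cluster $i$ falls in category $c$, and $y_s$ the collection of all sampled responses. The sketch shows only the standard decomposition into observed and predicted parts; it does not reproduce the article's formulas for the conditional means, which depend on its cluster correlation structure.

```latex
% Finite population proportion for category c (illustrative notation)
P_c = \frac{1}{N} \sum_{i=1}^{K} \sum_{j=1}^{N_i} y_{ijc}

% Best (minimum mean squared error) predictor: observed responses are kept,
% and each non-sampled response is replaced by its conditional mean
% given the sampled responses y_s
\hat{P}_c = \frac{1}{N} \Biggl[
    \sum_{i \in s} \sum_{j \in s_i} y_{ijc}
  + \sum_{i \in s} \sum_{j \notin s_i} E\bigl(y_{ijc} \mid y_s\bigr)
  + \sum_{i \notin s} \sum_{j=1}^{N_i} E\bigl(y_{ijc} \mid y_s\bigr)
\Biggr]
```

Under this reading, the difficulty described in the abstract lies in the second term: within a sampled cluster, the conditional mean of a non-sampled response depends on the sampled responses through the within-cluster correlation, so it cannot be obtained from an independence assumption.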

Acknowledgements

The author would like to thank the reviewer for comments and suggestions leading to the improvement of the paper. Thanks are also due to the Editor in Chief, the Editor and an Associate Editor for their suggestions during the review process.

Funding

No funding was used to complete this research.

Author information

Authors and Affiliations

Memorial University, St. John’s, Canada
Brajendra C. Sutradhar

Corresponding author

Correspondence to Brajendra C. Sutradhar.

Ethics declarations

Conflict of Interests

There is no conflict of interest.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sutradhar, B.C. Prediction Theory for Multinomial Proportions Using Two-stage Cluster Samples. Sankhya A 85, 1452–1488 (2023). https://doi.org/10.1007/s13171-022-00297-0

Received: 05 July 2021
Accepted: 21 September 2022
Published: 24 October 2022
Issue Date: August 2023
DOI: https://doi.org/10.1007/s13171-022-00297-0
