Finite Population Survey Sampling: An Unapologetic Bayesian Perspective

Banerjee, Sudipto

doi:10.1007/s13171-024-00348-8

Published: 08 April 2024

Sankhya A (2024)

Sudipto Banerjee ORCID: orcid.org/0000-0002-2239-208X¹

Abstract

This article attempts to offer some perspectives on Bayesian inference for finite population quantities when the units in the population are assumed to exhibit complex dependencies. Beginning with an overview of Bayesian hierarchical models, including some that yield design-based Horvitz-Thompson estimators, the article proceeds to introduce dependence in finite populations and sets out inferential frameworks for ignorable and nonignorable responses. Multivariate dependencies using graphical models and spatial processes are discussed and some salient features of two recent analyses for spatial finite populations are presented.
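
As a point of reference for the design-based estimator named in the abstract, the Horvitz-Thompson estimator of a finite population total weights each sampled value by the inverse of its first-order inclusion probability. The following is a minimal sketch under simple assumptions, not code from the article; the function name, the population size, and the sample values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Horvitz-Thompson estimate of a population total.

    y  : observed values for the sampled units
    pi : first-order inclusion probabilities of those units, each in (0, 1]
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit "represents" 1 / pi_i units of the population.
    return sum(yi / p for yi, p in zip(y, pi))


# Hypothetical data: simple random sampling without replacement of
# n = 4 units from a population of N = 100, so every pi_i = n / N.
N, n = 100, 4
sample_values = [12.0, 15.0, 9.0, 14.0]
inclusion_probs = [n / N] * n

estimate = horvitz_thompson_total(sample_values, inclusion_probs)
```

Under equal inclusion probabilities the estimate reduces to N times the sample mean, which is a quick sanity check on the weighting.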

Acknowledgements

The author thanks the editors and an anonymous referee for valuable feedback. The author is especially grateful to Professors Roderick J. Little and Trivellore Raghunathan of the University of Michigan, Ann Arbor, U.S.A., and Professor J.N.K. Rao of Carleton University, Ottawa, Canada, for insightful discussions on inference for finite populations. The work of the author has been supported, in part, by the National Science Foundation (NSF) through grants NSF/DMS 1916349 and NSF/DMS 2113778, by the National Institute of Environmental Health Sciences (NIEHS) through grants R01ES030210 and R01ES027027, and by the National Institute of General Medical Sciences through grant R01GM148761.

Funding

The work of the author has been supported, in part, by the National Science Foundation (NSF) through grants NSF/DMS 1916349 and NSF/DMS 2113778, by the National Institute of Environmental Health Sciences (NIEHS) through grants R01ES030210 and R01ES027027, and by the National Institute of General Medical Sciences through grant R01GM148761.

Author information

Authors and Affiliations

UCLA Department of Biostatistics; UCLA Department of Statistics and Data Science, University of California Los Angeles, 650 Charles E. Young Drive South, Los Angeles, 90095-1772, California, USA
Sudipto Banerjee

Corresponding author

Correspondence to Sudipto Banerjee.

Ethics declarations

Conflict of interest

The author declares that there are no financial or non-financial conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Banerjee, S. Finite Population Survey Sampling: An Unapologetic Bayesian Perspective. Sankhya A (2024). https://doi.org/10.1007/s13171-024-00348-8

Received: 16 June 2023
Published: 08 April 2024
DOI: https://doi.org/10.1007/s13171-024-00348-8
