Abstract
We introduce a novel method for drawing a sample from a finite population in which units with desired characteristics are over-represented. The approach is both sequential and adaptive and, through suitable compositions of predictive and objective functions, allows specific subsets of the population to be targeted. We consider the problem of estimation and conjecture the validity of a modified Horvitz–Thompson estimator capable of accounting for the imbalance induced by the targeting procedure. After discussing how to apply the method to the sampling of geographically distributed units, we investigate its potential via simulations.
Acknowledgements
The idea of using Hot Deck imputation to construct a resampling population, as described in Section 4, originates from unpublished work on bootstrapping non-iid samples developed in collaboration among Fulvia Mecatti, Pierluigi Conti and the first author. A draft version of that work, which will appear elsewhere, is available at https://arxiv.org/abs/1705.03827v2. The authors also wish to thank two anonymous referees for their valuable comments, which helped improve the quality of this work.
Additional information
Handling Editor: Bryan F. J. Manly.
Cite this article
Andreis, F., Bonetti, M. A proposal for a two-step sampling design to oversample units responding to prescribed characteristics. Environ Ecol Stat 25, 139–154 (2018). https://doi.org/10.1007/s10651-017-0396-9
DOI: https://doi.org/10.1007/s10651-017-0396-9