A proposal for a two-step sampling design to oversample units responding to prescribed characteristics

Environmental and Ecological Statistics

Abstract

We introduce a novel method for drawing a sample from a finite population in which units with desired characteristics are over-represented. The approach is both sequential and adaptive and, via suitable compositions of predictive and objective functions, allows specific subsets of the population to be targeted. We consider the problem of estimation and conjecture the validity of a modified Horvitz–Thompson estimator capable of accounting for the imbalance induced by the targeting procedure. After discussing how to apply the method to the sampling of geographically distributed units, we investigate its potential via simulations.
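
As a point of reference for the estimation problem mentioned above, the following is a minimal sketch of the standard Horvitz–Thompson estimator of a population total under unequal inclusion probabilities. It is not the modified estimator proposed in the article, whose form is not given in the abstract. The toy population, the Poisson sampling scheme, the inclusion probabilities and all function names are assumptions made purely for illustration.

```python
import random


def horvitz_thompson_total(sample_values, sample_probs):
    """Standard Horvitz-Thompson estimate of a population total.

    Each sampled value is weighted by the inverse of its first-order
    inclusion probability, so oversampled units are weighted down.
    """
    return sum(y / pi for y, pi in zip(sample_values, sample_probs))


def poisson_sample(inclusion_probs, rng):
    """Poisson sampling: unit i is included independently with probability pi_i."""
    return [i for i, pi in enumerate(inclusion_probs) if rng.random() < pi]


def mean_ht_estimate(y, inclusion_probs, n_reps, seed):
    """Average the HT estimate of the total over repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        sampled = poisson_sample(inclusion_probs, rng)
        estimates.append(
            horvitz_thompson_total(
                [y[i] for i in sampled],
                [inclusion_probs[i] for i in sampled],
            )
        )
    return sum(estimates) / n_reps


# Hypothetical toy population: y = 1 marks a unit with the target characteristic.
rng = random.Random(0)
y = [1 if rng.random() < 0.1 else 0 for _ in range(1000)]

# Hypothetical targeting: units with the characteristic are sampled far more often.
# In practice the probabilities would depend on auxiliary information, not on y itself.
inclusion_probs = [0.8 if value == 1 else 0.05 for value in y]

true_total = sum(y)
average_estimate = mean_ht_estimate(y, inclusion_probs, n_reps=2000, seed=1)
print(true_total, round(average_estimate, 2))
```

Although the sample is heavily imbalanced towards units with the characteristic, the inverse-probability weights should bring the average estimate close to the true total. This works here only because the inclusion probabilities are known exactly; in a sequential, adaptive design they are typically harder to obtain, which is presumably why a modified estimator is considered.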

Acknowledgements

The idea of using hot deck imputation to construct a resampling population, as described in Section 4, originates from unpublished work on bootstrapping non-iid samples developed jointly by Fulvia Mecatti, Pierluigi Conti and the first author. A draft version of that work, which will appear elsewhere, is available at https://arxiv.org/abs/1705.03827v2. The authors also wish to thank two anonymous referees for their valuable comments, which helped improve the quality of this work.

Author information

Corresponding author

Correspondence to Federico Andreis.

Additional information

Handling Editor: Bryan F. J. Manly.

About this article

Cite this article

Andreis, F., Bonetti, M. A proposal for a two-step sampling design to oversample units responding to prescribed characteristics. Environ Ecol Stat 25, 139–154 (2018). https://doi.org/10.1007/s10651-017-0396-9

  • DOI: https://doi.org/10.1007/s10651-017-0396-9
