Abstract
Even excellent data analysis methodologies fail to produce good results when applied to bad data. Bad data arise from inadequate strategies at the collection stage, which introduce bias or are insufficient to produce accurate estimates of the parameters of interest. Sampling is the statistical subfield that uses randomness as an ally in data gathering. In ideal situations the gold standard is to collect samples without replacement: each item then brings in new information, and as a consequence the estimator has smaller variance than the corresponding estimator under sampling with replacement. A quick overview of sampling strategies is presented, showing how they deal with cost control in non-ideal circumstances. Comments on the use of immoderately large samples, on the reuse of samples, on computational sample augmentation, and on other misuses of statistics are included, in the hope that these alerts improve the quality of statistical findings, which are so often blurred because sophisticated statistical analysis is useless when it is fed bad data.
To guess is cheap, to guess cheaply can be wrong and expensive.
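The variance reduction claimed in the abstract for sampling without replacement can be checked on a toy example. The sketch below is only an illustration and is not taken from the chapter: the population values, the sample size, and the helper names are hypothetical. It enumerates every possible sample of size n from a small finite population of size N and compares the exact variance of the sample mean under both schemes. For simple random sampling the two variances differ by the standard finite population correction factor (N - n)/(N - 1).

```python
from itertools import combinations, product


def variance(values):
    """Variance with divisor len(values), i.e. over a fully enumerated set."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def var_mean_with_replacement(population, n):
    """Exact variance of the sample mean over all ordered samples with replacement."""
    means = [sum(s) / n for s in product(population, repeat=n)]
    return variance(means)


def var_mean_without_replacement(population, n):
    """Exact variance of the sample mean over all samples without replacement."""
    means = [sum(s) / n for s in combinations(population, n)]
    return variance(means)


# Hypothetical toy population and sample size, chosen only for illustration.
population = [2, 4, 6, 8, 10]
n = 2
N = len(population)

v_wr = var_mean_with_replacement(population, n)
v_wor = var_mean_without_replacement(population, n)
fpc = (N - n) / (N - 1)  # finite population correction factor

print(v_wr, v_wor, v_wr * fpc)  # 4.0 3.0 3.0
```

In this toy case the variance of the sample mean drops from 4.0 to 3.0 when items are not replaced, which matches the with-replacement variance multiplied by (N - n)/(N - 1) = 0.75. Full enumeration is feasible only for very small populations; it is used here because it gives exact values rather than simulation estimates.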
Acknowledgments
This research has been supported by National Funds through FCT—Fundação para a Ciência e a Tecnologia, project PEst-OE/MAT/UI0006/2011, and PTDC/FEDER.
The authors are grateful to Prof. Sneh Gulati for very useful comments, which helped improve the readability of the text, and to the anonymous referees for comments and suggestions that were very helpful in improving the presentation.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pestana, D., Rocha, M.L., Sequeira, F. (2015). Finite Populations Sampling Strategies and Costs Control. In: Kitsos, C., Oliveira, T., Rigas, A., Gulati, S. (eds) Theory and Practice of Risk Assessment. Springer Proceedings in Mathematics & Statistics, vol 136. Springer, Cham. https://doi.org/10.1007/978-3-319-18029-8_16
DOI: https://doi.org/10.1007/978-3-319-18029-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18028-1
Online ISBN: 978-3-319-18029-8
eBook Packages: Mathematics and Statistics (R0)