Combining Scientific and Non-scientific Surveys to Improve Estimation and Reduce Costs

Part of the book series: Computational Social Sciences (CSS)

Abstract

Survey data collection costs have risen to a point where many survey researchers are abandoning large, expensive probability-based samples in favor of less expensive nonprobability samples. The empirical literature suggests this strategy may be unwise for many reasons, among them that probability samples tend to outperform nonprobability samples on accuracy when assessed against population benchmarks. Nevertheless, the attractive cost properties and convenience of nonprobability samples suggest they are here to stay. Instead of forgoing probability sampling entirely, we consider a method of combining probability and nonprobability samples in a way that exploits their strengths to overcome their weaknesses. Using Bayesian inference, we evaluate the use of nonprobability data as a supplement to estimates based on small probability samples. In a case study involving actual survey data, we show that specifying prior distributions using nonprobability data considerably reduces variances and mean-squared errors for estimates of two commonly used health variables, height and weight, compared to the probability-only sample estimates. We further show that these gains in efficiency yield expected cost savings of up to 66%, based on actual cost data from eight nonprobability surveys conducted by different commercial vendors and assumed cost data for a probability-based Internet panel. We conclude with a discussion of these findings, their implications for survey practice, and possible research extensions.

Electronic Supplementary Material: The online version of this chapter (https://doi.org/10.1007/978-3-030-54936-7_4) contains supplementary material, which is available to authorized users.

Notes

  1. Bootstrap methods have been used in many contexts and were originally proposed by Efron (1979). The general approach is to randomly draw subsamples with replacement from the full sample a large number of times and estimate the statistic of interest in each subsample before combining them using a bootstrap estimator.

  2. We assume the per-unit cost of the German Internet Panel (GIP) is higher than the per-unit costs of the nonprobability surveys because of the interviewer-administered recruitment and the setup costs of equipping the offline population. Further, we reason that, in practice, a high response rate would be desired for the small probability sample to minimize the risk of nonresponse bias in the sparse sample, and extensive recruitment efforts may be needed to achieve it.
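The bootstrap procedure described in Note 1 can be illustrated with a minimal sketch. This is not the chapter's implementation: the function name `bootstrap_mean`, the choice of the mean as the statistic of interest, the resample size (equal to the full sample size), the number of replicates, and the height values are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean(sample, n_replicates=2000, seed=0):
    """Return (bootstrap estimate of the mean, bootstrap standard error).

    Hypothetical helper for illustration only. Each replicate is drawn
    with replacement and has the same size as the full sample (an
    assumption; the note does not state the resample size).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicate_means = []
    for _ in range(n_replicates):
        # Draw n units with replacement from the full sample.
        resample = rng.choices(sample, k=n)
        # Estimate the statistic of interest in this resample.
        replicate_means.append(statistics.fmean(resample))
    # Combine the replicates: their average is the bootstrap estimate,
    # and their standard deviation is the bootstrap standard error.
    estimate = statistics.fmean(replicate_means)
    std_error = statistics.stdev(replicate_means)
    return estimate, std_error


# Made-up heights in centimetres, not data from the chapter.
heights_cm = [162.0, 175.5, 168.2, 181.0, 170.3, 158.9, 177.4, 166.1]
estimate, std_error = bootstrap_mean(heights_cm)
print(f"bootstrap mean: {estimate:.2f} cm, standard error: {std_error:.2f} cm")
```

Any other statistic (a variance, a quantile, a model coefficient) could replace the mean inside the loop; the resampling and combining steps stay the same.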

References

  • AAPOR, Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 9th edn. (American Association for Public Opinion Research, 2016)

  • S. Ansolabehere, D. Rivers, Cooperative survey research. Ann. Rev. Polit. Sci. 16, 307–329 (2013)

  • R. Baker, J.M. Brick, N.A. Bates, M. Battaglia, M.P. Couper, J.A. Dever, K.J. Gile, R. Tourangeau, Summary report of the AAPOR task force on non-probability sampling. J. Surv. Stat. Methodol. 1(2), 90–143 (2013)

  • T. Bayes, An essay towards solving a problem in the doctrine of chances. Philos. Trans. 53, 370–418 (1763)

  • K.S. Berbaum, D.D. Dorfman, E.A. Franken, R.T. Caldwell, An empirical comparison of discrete ratings and subjective probability ratings. Acad. Radiol. 9(7), 756–763 (2002)

  • A.G. Blom, C. Gathmann, U. Krieger, Setting up an online panel representative of the general population: the German internet panel. Field Methods 27(4), 391–408 (2015)

  • A.G. Blom, J.M.E. Herzing, C. Cornesse, J.W. Sakshaug, U. Krieger, D. Bossert, Does the recruitment of offline households increase the sample representativeness of probability-based online panels? evidence from the German internet panel. Soc. Sci. Comput. Rev. 35(4), 498–520 (2017)

  • A.G. Blom, D. Ackermann-Piek, S.C. Helmschrott, C. Cornesse, J.W. Sakshaug, The representativeness of online panels: coverage, sampling and weighting, in Paper Presented at the General Online Research Conference (2017)

  • D. Briggs, D. Fecht, K. De Hoogh, Census data issues for epidemiology and health risk assessment: experiences from the small area health statistics unit. J. R. Stat. Soc. Ser. A (Stat. Soc.) 170(2), 355–378 (2007)

  • L. Chang, J.A. Krosnick, National surveys via RDD telephone interviewing versus the internet: comparing sample representativeness and response quality. Public Opin. Q. 73(4), 641–678 (2009)

  • C. Cornesse, A.G. Blom, D. Dutwin, J.A. Krosnick, E.D. De Leeuw, S. Legleye, J. Pasek, D. Pennay, B. Phillips, J.W. Sakshaug, B. Struminskaya, A. Wenz, A review of conceptual approaches and empirical evidence on probability and nonprobability sample survey research. J. Surv. Stat. Methodol. 8(1), 4–36 (2020)

  • B.O. Daponte, J.B. Kadane, L.J. Wolfson, Bayesian demography: projecting the Iraqi Kurdish population, 1977–1990. J. Am. Stat. Assoc. 92(440), 1256–1267 (1997)

  • D. Dutwin, T.D. Buskirk, Apples to oranges or gala versus golden delicious? comparing data quality of nonprobability internet samples to low response rate probability samples. Public Opin. Q. 81(S1), 213–239 (2017)

  • B. Efron, Bootstrap methods: another look at the jackknife. Ann. Stat. 7, 1–26 (1979)

  • A. Gelman, J.B. Carlin, H.S. Stern, D.B. Rubin, Bayesian Data Analysis, 3rd edn. (Chapman & Hall/CRC, Boca Raton, 2013)

  • S. Lee, Propensity score adjustment as a weighting scheme for volunteer panel web surveys. J. Off. Stat. 22(2), 329 (2006)

  • S. Lee, R. Valliant, Estimation for volunteer panel web surveys using propensity score adjustment and calibration adjustment. Sociol. Methods Res. 37(3), 319–343 (2009)

  • N. Malhotra, J.A. Krosnick, The effect of survey mode and sampling on inferences about political attitudes and behavior: comparing the 2000 and 2004 ANES to internet surveys with nonprobability samples. Polit. Anal. 15, 286–323 (2007)

  • S. Marchetti, C. Giusti, M. Pratesi, The use of Twitter data to improve small area estimates of households' share of food consumption expenditure in Italy. AStA Wirtschafts- und Sozialstatistisches Archiv 10(2–3), 79–93 (2016)

  • A.H. Murphy, H. Daan, Impacts of feedback and experience on the quality of subjective probability forecasts. Comparison of results from the first and second years of the Zierikzee experiment. Mon. Weather Rev. 112(3), 413–423 (1984)

  • A. O’Hagan, C.E. Buck, A. Daneshkhah, J.R. Eiser, P.H. Garthwaite, D.J. Jenkinson, J.E. Oakley, T. Rakow, Uncertain Judgements: Eliciting Experts’ Probabilities (Wiley, Chichester, 2006)

  • J. Pasek, When will nonprobability surveys mirror probability surveys? considering types of inference and weighting strategies as criteria for correspondence. Int. J. Public Opin. Res. 28(2), 269–291 (2016)

  • D.W. Pennay, D. Neiger, P.J. Lavrakas, K.A. Borg, S. Mission, N. Honey, Australian online panels benchmarking study, in Presented at the 69th Annual Conference of the World Association for Public Opinion Research, Austin, May (2016)

  • A.T. Porter, S.H. Holan, C.K. Wikle, N. Cressie, Spatial Fay–Herriot models for small area estimation with functional covariates. Spatial Stat. 10, 27–42 (2014)

  • S.S. Qian, K.H. Reckhow, Modeling phosphorus trapping in wetlands using nonparametric Bayesian regression. Water Resour. Res. 34(7), 1745–1754 (1998)

  • R Core Team, R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2016)

  • J.N. Rao, Small-Area Estimation (Wiley Online Library, Hoboken, 2003)

  • J. Raymer, A. Wiśniowski, J.J. Forster, P.W. Smith, J. Bijak, Integrated modeling of European migration. J. Am. Stat. Assoc. 108(503), 801–819 (2013)

  • S. Renooij, C. Witteman, Talking probabilities: communicating probabilistic information with words and numbers. Int. J. Approx. Reason. 22(3), 169–194 (1999)

  • D. Rivers, Sampling for web surveys, in Joint Statistical Meetings (2007)

  • D. Rivers, D. Bailey, Inference from matched samples in the 2008 US national elections, in Proceedings of the Joint Statistical Meetings, Vol. 1, pp. 627–639 (YouGov/Polimetrix, Palo Alto, 2009)

  • J.W. Sakshaug, A. Wiśniowski, D.A. Perez-Ruiz, A.G. Blom, Supplementing small probability samples with nonprobability samples: a Bayesian approach. J. Off. Stat. 35(3), 653–681 (2019)

  • C.P. Schmertmann, S.M. Cavenaghi, R.M. Assunção, J.E. Potter, Bayes plus brass: estimating total fertility for many small areas from sparse census data. Popul. Stud. 67(3), 255–273 (2013)

  • R. Valliant, J.A. Dever, Estimating propensity adjustments for volunteer web surveys. Sociol. Methods Res. 40(1), 105–137 (2011)

  • L.C. van der Gaag, S. Renooij, C.L.M. Witteman, B.M.P. Aleman, B.G. Taal, Probabilities for a probabilistic network: a case study in oesophageal cancer. Artif. Intell. Med. 25(2), 123–148 (2002)

  • M.D. Vescio, R.L. Thompson, Forecaster’s forum: subjective tornado probability forecasts in severe weather watches. Weather Forecast. 16(1), 192–195 (2001)

  • A. Wiśniowski, J.W. Sakshaug, D.A. Perez-Ruiz, A.G. Blom, Integrating probability and nonprobability samples for survey inference. J. Surv. Stat. Methodol. 8, 120–147 (2020)

  • D.S. Yeager, J.A. Krosnick, L. Chang, H.S. Javitz, M.S. Levendusky, A. Simpser, R. Wang, Comparing the accuracy of RDD telephone surveys and internet surveys conducted with probability and non-probability samples. Public Opin. Q. nfr020 (2011)

Author information

Corresponding author

Correspondence to Joseph W. Sakshaug.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Sakshaug, J.W., Wiśniowski, A., Perez Ruiz, D.A., Blom, A.G. (2021). Combining Scientific and Non-scientific Surveys to Improve Estimation and Reduce Costs. In: Rudas, T., Péli, G. (eds) Pathways Between Social Science and Computational Social Science. Computational Social Sciences. Springer, Cham. https://doi.org/10.1007/978-3-030-54936-7_4

  • DOI: https://doi.org/10.1007/978-3-030-54936-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54935-0

  • Online ISBN: 978-3-030-54936-7

  • eBook Packages: Social Sciences, Social Sciences (R0)
