Using ‘found’ data to augment a probability sample: Procedure and case study

Environmental Monitoring and Assessment

Abstract

While probability sampling has the advantage of permitting unbiased population estimates, many past and existing monitoring schemes do not employ probability sampling. We describe and demonstrate a general procedure for augmenting an existing probability sample with data from nonprobability-based surveys (‘found’ data). The procedure, first proposed by Overton (1990), uses sampling frame attributes to group the probability and found samples into similar subsets. Subsequently, this similarity is assumed to reflect the representativeness of the found sample for the matching subpopulation. Two methods of establishing similarity and producing estimates are described: pseudo-random and calibration. The pseudo-random method is used when the found sample can contribute additional information on variables already measured for the probability sample, thus increasing the effective sample size. The calibration method is used when the found sample contributes information that is unique to the found observations. For either approach, the found sample data yield observations that are treated as a probability sample, and population estimates are made according to a probability estimation protocol. To demonstrate these approaches, we applied them to found and probability samples of stream discharge data for the southeastern US.
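The abstract does not give the estimators themselves, so the following is only a minimal Python sketch of the general idea, under stated assumptions. It assumes each unit carries a hypothetical `group` label derived from sampling frame attributes, that probability-sample units carry a design `weight`, and that a simple group-mean estimator stands in for the paper's actual probability estimation protocol. The function names and the example values are illustrative and do not come from the article.

```python
from collections import defaultdict


def group_weight_totals(prob_sample):
    """Estimate each group's subpopulation size as the sum of design weights."""
    totals = defaultdict(float)
    for unit in prob_sample:
        totals[unit["group"]] += unit["weight"]
    return dict(totals)


def pseudo_random_total(prob_sample, found_sample, var):
    """Sketch of the pseudo-random idea (simplified, assumed form).

    Found units are pooled with probability units in the same group and
    treated as if they had been drawn at random from that subpopulation,
    which raises the effective sample size for a variable both samples share.
    """
    n_hat = group_weight_totals(prob_sample)
    pooled = defaultdict(list)
    for unit in list(prob_sample) + list(found_sample):
        # Found units in a group with no probability units are skipped:
        # there is no design-based size estimate to attach them to.
        if unit["group"] in n_hat:
            pooled[unit["group"]].append(unit[var])
    return sum(n_hat[g] * sum(v) / len(v) for g, v in pooled.items())


def calibration_total(prob_sample, found_sample, var):
    """Sketch of the calibration idea (simplified, assumed form).

    The variable exists only on found units; the probability sample supplies
    the estimated group sizes that scale the found-sample group means.
    """
    n_hat = group_weight_totals(prob_sample)
    values = defaultdict(list)
    for unit in found_sample:
        if unit["group"] in n_hat:
            values[unit["group"]].append(unit[var])
    missing = set(n_hat) - set(values)
    if missing:
        raise ValueError(f"no found observations for groups: {sorted(missing)}")
    return sum(n_hat[g] * sum(v) / len(v) for g, v in values.items())


# Made-up illustrative data: two groups, discharge on both samples,
# sediment on the found sample only.
prob = [
    {"group": "A", "weight": 10.0, "discharge": 2.0},
    {"group": "A", "weight": 10.0, "discharge": 4.0},
    {"group": "B", "weight": 5.0, "discharge": 1.0},
]
found = [
    {"group": "A", "discharge": 6.0, "sediment": 3.0},
    {"group": "B", "discharge": 3.0, "sediment": 1.0},
]

discharge_total = pseudo_random_total(prob, found, "discharge")
sediment_total = calibration_total(prob, found, "sediment")
```

Both functions rest on the assumption named in the abstract: that similarity on frame attributes makes the found units representative of the matching subpopulation. If that assumption fails for a group, the pooled or calibrated estimate for that group inherits the found sample's selection bias.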

References

  • Messer, J.J., Ariss, C.W., Baker, J.R., Drouse, S.K., Eshleman, K.N., Kaufmann, P.R., Linthurst, R.A., Omernik, J.M., Overton, W.S., Sale, M.J., Schonbrod, R.D., Stambaugh, S.M. and Tuschall, J.R. Jr.: 1986, ‘National Surface Water Survey: National Stream Survey Phase I — Pilot Survey’, EPA/600/4-86/026. U.S. Environmental Protection Agency, Office of Research and Development, Washington DC.

  • Overton, W.S.: 1987, ‘A Sampling and Analysis Plan for Streams, in the National Surface Water Survey Conducted by the EPA’, Technical Report 117, Department of Statistics, Oregon State University, Corvallis.

  • Overton, W.S.: 1989, ‘Calibration Methodology for the Double Sample Structure of the National Lake Survey Phase II Sample’, Technical Report 130, Department of Statistics, Oregon State University, Corvallis.

  • Overton, W.S.: 1990, ‘A Strategy for Use of Found Samples in a Rigorous Monitoring Design’, Technical Report 139, Department of Statistics, Oregon State University, Corvallis.

  • Smith, B.G.: 1987, ‘CLUSB, Version 3, Recording for Microcomputer and Manual Revision’, Unpublished manuscript.

  • Young, T.C., DePinto, J.V. and Heidtke, T.M.: 1988, ‘Some Factors Affecting Fluvial Load Estimation Efficiency’, Water Resources Research 24, 1535–1540.

About this article

Mc Overton, J.C., Young, T.C. & Overton, W.S. Using ‘found’ data to augment a probability sample: Procedure and case study. Environ Monit Assess 26, 65–83 (1993). https://doi.org/10.1007/BF00555062
