Using ‘found’ data to augment a probability sample: Procedure and case study
While probability sampling has the advantage of permitting unbiased population estimates, many past and existing monitoring schemes do not employ probability sampling. We describe and demonstrate a general procedure for augmenting an existing probability sample with data from nonprobability-based surveys (‘found’ data). The procedure, first proposed by Overton (1990), uses sampling frame attributes to group the probability and found samples into similar subsets. Subsequently, this similarity is assumed to reflect the representativeness of the found sample for the matching subpopulation. Two methods of establishing similarity and producing estimates are described: pseudo-random and calibration. The pseudo-random method is used when the found sample can contribute additional information on variables already measured for the probability sample, thus increasing the effective sample size. The calibration method is used when the found sample contributes information that is unique to the found observations. For either approach, the found sample data yield observations that are treated as a probability sample, and population estimates are made according to a probability estimation protocol. To demonstrate these approaches, we applied them to found and probability samples of stream discharge data for the southeastern US.
Keywords: Environmental Management; General Procedure; Discharge Data; Sampling Frame; Probability Estimation
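The pseudo-random method summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each unit already carries a group label derived from sampling frame attributes, that probability-sample weights are inverse inclusion probabilities, and that within a group every pooled unit (probability or found) shares the group's estimated size equally. The function name `pseudo_random_total` and the data layout are hypothetical.

```python
from collections import defaultdict


def pseudo_random_total(prob_sample, found_sample):
    """Estimate a population total after pooling found units into groups.

    prob_sample:  list of (group, weight, value), weight = 1 / inclusion probability
    found_sample: list of (group, value), with no design weights
    Assumption (not from the source): equal weighting of pooled units per group.
    """
    group_size = defaultdict(float)   # estimated subpopulation size per group
    group_values = defaultdict(list)  # pooled observations per group

    # The probability sample alone determines each group's estimated size.
    for group, weight, value in prob_sample:
        group_size[group] += weight
        group_values[group].append(value)

    # Found units are used only in groups the probability sample covers;
    # elsewhere there is no basis for judging their representativeness.
    for group, value in found_sample:
        if group in group_values:
            group_values[group].append(value)

    # Expand each group's pooled mean by the group's estimated size.
    total = 0.0
    for group, values in group_values.items():
        total += group_size[group] * sum(values) / len(values)
    return total


prob = [("A", 10.0, 2.0), ("A", 10.0, 4.0), ("B", 5.0, 1.0)]
found = [("A", 6.0), ("C", 100.0)]
estimate = pseudo_random_total(prob, found)
```

In this toy input the found unit in group "A" raises the number of observations behind that group's mean from two to three, while the found unit in group "C" is ignored because no probability-sample unit matches it.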
- Messer, J.J., Ariss, C.W., Baker, J.R., Drouse, S.K., Eshleman, K.N., Kaufmann, P.R., Linthurst, R.A., Omernik, J.M., Overton, W.S., Sale, M.J., Schonbrod, R.D., Stambaugh, S.M. and Tuschall, J.R. Jr.: 1986, ‘National Surface Water Survey: National Stream Survey Phase I — Pilot Survey’, EPA/600/4-86/026. U.S. Environmental Protection Agency, Office of Research and Development, Washington DC.
- Overton, W.S.: 1987, ‘A Sampling and Analysis Plan for Streams, in the National Surface Water Survey Conducted by the EPA’, Technical Report 117, Department of Statistics, Oregon State University, Corvallis.
- Overton, W.S.: 1989, ‘Calibration Methodology for the Double Sample Structure of the National Lake Survey Phase II Sample’, Technical Report 130, Department of Statistics, Oregon State University, Corvallis.
- Overton, W.S.: 1990, ‘A Strategy for Use of Found Samples in a Rigorous Monitoring Design’, Technical Report 139, Department of Statistics, Oregon State University, Corvallis.
- Smith, B.G.: 1987, ‘CLUSB, Version 3, Recording for Microcomputer and Manual Revision’, Unpublished manuscript.
- Young, T.C., DePinto, J.V. and Heidtke, T.M.: 1988, ‘Some Factors Affecting Fluvial Load Estimation Efficiency’, Water Resources Research 24, 1535–1540.