Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu

Sampling Techniques for Statistical Databases

  • Amarnath Gupta
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1293

Definition

A sampling technique is a method by which one inspects only a small portion of the data in a database to reduce the time needed to compute an aggregate query, while ensuring that the result computed on the sample faithfully represents the true result of the query over the entire data population.
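To make the definition concrete, the following is a minimal Python sketch of estimating a SUM aggregate from a simple random sample rather than a full scan. The function name `estimate_sum`, the synthetic `population`, and the sample size are illustrative assumptions, not part of the entry.

```python
import random


def estimate_sum(values, sample_size, rng):
    """Estimate SUM(values) from a simple random sample (without replacement)."""
    n = len(values)
    sample = rng.sample(values, sample_size)
    # Scale the sample mean up to the size of the whole population.
    return n * sum(sample) / sample_size


# Hypothetical data: 100,000 synthetic values standing in for a table column.
rng = random.Random(0)
population = [rng.uniform(0, 100) for _ in range(100_000)]

estimate = estimate_sum(population, 1_000, rng)
exact = sum(population)
print(f"estimate={estimate:.0f} exact={exact:.0f}")
```

Only 1,000 of the 100,000 values are read to produce the estimate; how closely it tracks the exact sum depends on the sample size and on the variance of the data.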

Example

Acceptance-Rejection sampling (AR sampling) is a sampling technique.
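As a rough illustration, one common form of acceptance-rejection sampling draws a candidate record uniformly at random and accepts it with probability proportional to its weight. The sketch below assumes that form; the names `ar_sample`, `weight`, and `w_max` and the toy records are hypothetical, and it is not necessarily the exact formulation the entry describes.

```python
import random


def ar_sample(records, weight, w_max, k, rng):
    """Draw k records with replacement, each chosen with probability
    proportional to weight(record).

    Assumes 0 <= weight(r) <= w_max for every record and that at least
    one record has a positive weight (otherwise the loop never ends).
    """
    if w_max <= 0:
        raise ValueError("w_max must be positive")
    chosen = []
    while len(chosen) < k:
        candidate = rng.choice(records)  # uniform proposal
        # Accept with probability weight(candidate) / w_max, else reject.
        if rng.random() * w_max < weight(candidate):
            chosen.append(candidate)
    return chosen


# Hypothetical records: (key, weight) pairs with weights 1 and 3.
records = [("a", 1.0), ("b", 3.0)]
sample = ar_sample(records, lambda r: r[1], 3.0, 10_000, random.Random(42))
share_b = sum(1 for key, _ in sample if key == "b") / len(sample)
print(f"share of 'b' in sample: {share_b:.3f}")
```

With these weights, "b" should make up roughly three quarters of the accepted records. The cost of the method is the rejected draws: the smaller the average weight is relative to `w_max`, the more candidates are discarded.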

Key Points

Sampling is used in a database for several reasons: (i) to estimate the results of aggregate queries (e.g., SUM, COUNT, or AVERAGE), (ii) to retrieve a sample of records from a database query for subsequent processing, (iii) for internal use by the query optimizer for selectivity estimation, and (iv) to provide privacy protection for records on individuals contained in statistical databases. It has been determined that fixed-size random sampling of data does not yield a true representation of the population. Acceptance/rejection (A/R) sampling is used to construct weighted samples in which the...


Recommended Reading

  1. Olken F, Rotem D. Random sampling from databases: a survey. Stat Comput. 1995;5:25–42.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. San Diego Supercomputer Center, University of California San Diego, La Jolla, USA

Section editors and affiliations

  • Amarnath Gupta
  1. San Diego Supercomputer Center, University of California San Diego, La Jolla, USA