Histograms on Streams

Encyclopedia of Database Systems

Synonyms

Piecewise-constant approximations

Definition

A B-bucket histogram of length N is a partition of the set [0, N) of N integers into intervals \( [b_0, b_1) \cup [b_1, b_2) \cup \dots \cup [b_{B-1}, b_B) \), where \( b_0 = 0 \) and \( b_B = N \), together with a collection of B heights \( h_j \), for 0 ≤ j < B, one for each bucket. On point query i, the histogram answer is \( h_j \), where j is the index of the interval (or “bucket”) containing i; that is, the unique j with \( b_j \le i < b_{j+1} \). In vector notation, \( \chi_S \) is the vector that is 1 on the set S and zero elsewhere, and the answer vector of a histogram is \( \overrightarrow{H}=\sum_{0\le j<B} h_j\,\chi_{[b_j,\,b_{j+1})} \).
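As an illustration of the definition, the following minimal Python sketch answers point queries and expands a histogram into its answer vector. The function names `point_query` and `answer_vector`, the list-based representation, and the sample boundaries and heights are assumptions made for this example; they do not come from the entry.

```python
from bisect import bisect_right


def point_query(boundaries, heights, i):
    """Return h_j for the unique bucket [b_j, b_{j+1}) containing i.

    boundaries is the (hypothetical) list [b_0, ..., b_B] with b_0 = 0 and
    b_B = N; heights is the list [h_0, ..., h_{B-1}].
    """
    if not 0 <= i < boundaries[-1]:
        raise IndexError("i must lie in [0, N)")
    # bisect_right counts the boundaries that are <= i, so subtracting one
    # gives the index j with b_j <= i < b_{j+1}.
    j = bisect_right(boundaries, i) - 1
    return heights[j]


def answer_vector(boundaries, heights):
    """Expand the histogram into the length-N vector sum_j h_j * chi_[b_j, b_{j+1})."""
    vector = []
    for j, h in enumerate(heights):
        vector.extend([h] * (boundaries[j + 1] - boundaries[j]))
    return vector


# Illustrative 3-bucket histogram of length N = 8 (values chosen arbitrarily).
boundaries = [0, 3, 5, 8]
heights = [2.0, 7.0, 1.0]
```

With these sample values, `point_query(boundaries, heights, 3)` falls in the bucket [3, 5) and returns 7.0. Binary search over the B + 1 boundaries keeps each point query at O(log B) time, which is one reasonable choice when B is much smaller than N.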

A histogram, \( \overrightarrow{H} \), is often used to approximate some other function, \( \overrightarrow{A} \), on [0, N). In building a B-bucket histogram, it is desirable to choose B − 1 boundaries \( b_j \) and B heights \( h_j \) that tend to minimize some distance, e.g., the sum square error \( {\left\Vert...
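Taking the sum square error named above as the distance, a small Python sketch can make the objective concrete. It assumes the whole vector is held in memory (so it is not a streaming algorithm) and that every bucket is non-empty. The helper names `sum_square_error` and `mean_heights` and the sample data are hypothetical. The sketch relies on a standard fact: for fixed boundaries, the height that minimizes the squared error within a bucket is the mean of the values in that bucket.

```python
def sum_square_error(a, boundaries, heights):
    """Sum over i of (a[i] - H[i])^2, where H is the histogram's answer vector."""
    total = 0.0
    for j, h in enumerate(heights):
        for i in range(boundaries[j], boundaries[j + 1]):
            total += (a[i] - h) ** 2
    return total


def mean_heights(a, boundaries):
    """For fixed boundaries, return the bucket means, which minimize the
    sum square error within each bucket (assumes non-empty buckets)."""
    return [
        sum(a[lo:hi]) / (hi - lo)
        for lo, hi in zip(boundaries, boundaries[1:])
    ]


# Illustrative data of length N = 6 and fixed boundaries for B = 3 buckets.
a = [1, 1, 4, 6, 2, 2]
boundaries = [0, 2, 4, 6]
best = mean_heights(a, boundaries)               # [1.0, 5.0, 2.0]
best_error = sum_square_error(a, boundaries, best)  # 2.0
```

Choosing any other height for the middle bucket, such as 4.0 instead of 5.0, raises the error from 2.0 to 4.0. The harder part of the problem, choosing the boundaries themselves, is not addressed by this sketch.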

Recommended Reading

  1. Cormode G, Muthukrishnan S. An improved data stream summary: the count-min sketch and its applications. In: Proceedings of the 6th Latin American symposium on theoretical informatics; 2004, p. 29–38.

  2. Gilbert A, Guha S, Indyk P, Kotidis Y, Muthukrishnan S, Strauss M. Fast, small-space algorithms for approximate histogram maintenance. In: Proceedings of the 34th annual ACM symposium on theory of computing; 2002, p. 389–98.

  3. Guha S, Koudas N, Shim K. Approximation and streaming algorithms for histogram construction problems. ACM Trans Database Syst. 2006;31(1):396–438.

  4. Ioannidis Y. The history of histograms (abridged). In: Proceedings of the 29th international conference on very large data bases; 2003, p. 19–30.

  5. Jagadish H, Koudas N, Muthukrishnan S, Poosala V, Sevcik K, Suel T. Optimal histograms with quality guarantees. In: Proceedings of the 24th international conference on very large data bases; 1998, p. 275–86.

  6. Muthukrishnan S, Strauss M. Approximate histogram and wavelet summaries of streaming data. In: Data-stream management – processing high-speed data streams. New York: Springer; 2009 (Data-Centric Systems and Applications Series).


Correspondence to Martin J. Strauss.

© 2016 Springer Science+Business Media LLC

Cite this entry

Strauss, M.J. (2016). Histograms on Streams. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_191-2

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_191-2

  • Publisher Name: Springer, New York, NY

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
