Definition
A B-bucket histogram of length N is a partition of the set \( [0, N) \) of N integers into intervals \( [b_0, b_1) \cup [b_1, b_2) \cup \dots \cup [b_{B-1}, b_B) \), where \( b_0 = 0 \) and \( b_B = N \), together with a collection of B heights \( h_j \), for \( 0 \le j < B \), one for each bucket. On point query i, the histogram answer is \( h_j \), where j is the index of the interval (or “bucket”) containing i; that is, the unique j with \( b_j \le i < b_{j+1} \). In vector notation, let \( \chi_S \) denote the vector that is 1 on the set S and 0 elsewhere; the answer vector of a histogram is then \( \overrightarrow{H}=\sum_{0\le j<B} h_j\,\chi_{\left[b_j,\,b_{j+1}\right)} \).
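The definition above can be illustrated with a minimal Python sketch. The function names `point_query` and `answer_vector`, the list-based representation, and the sample boundaries and heights are assumptions made for illustration; they are not part of the entry. The sketch assumes the boundaries are strictly increasing, so every bucket is non-empty.

```python
from bisect import bisect_right


def point_query(boundaries, heights, i):
    """Return h_j for the unique bucket j with b_j <= i < b_{j+1}.

    boundaries holds b_0, ..., b_B (so len(boundaries) == len(heights) + 1).
    """
    if not boundaries[0] <= i < boundaries[-1]:
        raise IndexError("query index outside [0, N)")
    # bisect_right counts the boundaries that are <= i; subtracting 1
    # gives the index j of the bucket whose left endpoint is the last one <= i.
    j = bisect_right(boundaries, i) - 1
    return heights[j]


def answer_vector(boundaries, heights):
    """Expand the histogram into its length-N answer vector H."""
    H = []
    for j, h in enumerate(heights):
        # h_j times the indicator vector of [b_j, b_{j+1})
        H.extend([h] * (boundaries[j + 1] - boundaries[j]))
    return H


# Hypothetical 3-bucket histogram of length N = 8.
boundaries = [0, 3, 5, 8]   # b_0 = 0, b_3 = N
heights = [2.0, 7.0, 1.0]   # h_0, h_1, h_2
H = answer_vector(boundaries, heights)
```

Storing only the B + 1 boundaries and B heights keeps the summary at O(B) space, and the binary search answers a point query in O(log B) time without materializing the length-N vector.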
A histogram, \( \overrightarrow{H} \), is often used to approximate some other function, \( \overrightarrow{A} \), on \( [0, N) \). In building a B-bucket histogram, it is desirable to choose B − 1 boundaries \( b_j \) and B heights \( h_j \) that tend to minimize some distance, e.g., the sum square error \( {\left\Vert...
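As a hedged illustration of the sum square error objective, the sketch below assumes the usual reading of that term: the sum over all i in [0, N) of the squared difference between A at i and the histogram answer at i. The function names and the sample data are hypothetical. The sketch fixes the boundaries and only chooses heights; for fixed boundaries, the mean of A over each bucket is the height that minimizes this error. Choosing good boundaries is the harder part of the problem and is not attempted here.

```python
def best_heights(A, boundaries):
    """For fixed boundaries, return the per-bucket means of A.

    Assumes strictly increasing boundaries, so no bucket is empty.
    """
    return [
        sum(A[lo:hi]) / (hi - lo)
        for lo, hi in zip(boundaries, boundaries[1:])
    ]


def sum_square_error(A, boundaries, heights):
    """Sum over i in [0, N) of (A[i] - H[i])**2, computed bucket by bucket."""
    err = 0.0
    for (lo, hi), h in zip(zip(boundaries, boundaries[1:]), heights):
        err += sum((a - h) ** 2 for a in A[lo:hi])
    return err


# Hypothetical data of length N = 8 and a 3-bucket partition.
A = [1.0, 1.0, 2.0, 9.0, 9.0, 4.0, 4.0, 4.0]
boundaries = [0, 3, 5, 8]
heights = best_heights(A, boundaries)
error = sum_square_error(A, boundaries, heights)
```

Here the second and third buckets are constant, so they contribute no error; all of the error comes from the first bucket, whose values are not all equal.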
Cite this entry
Strauss, M.J. (2016). Histograms on Streams. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_191-2