Years and Authors of Summarized Original Work
2004; Cormode, Muthukrishnan
Problem Definition
Sketching a large mathematical object means producing a compact data structure that approximately represents it. The Count-Min (CM) sketch is a sketch of a vector that allows several related quantities, including point queries and dot product queries, to be estimated with accuracy guarantees. Such queries are at the core of many computations, so the structure can be used to answer a variety of other queries, such as frequent items (heavy hitters), quantile finding, and join size estimation. Because the sketch processes updates in the form of additions to or subtractions from individual dimensions of the vector (which may correspond to insertions, deletions, or other transactions), it can work over streams of updates at high rates.
The data structure maintains the linear projection of the vector with a number of other random vectors. These vectors are defined...
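As a rough illustration of the structure described above, the following Python sketch keeps a small table of counters, one row per random hash function, and answers point and dot product queries from it. The preview text cuts off before the random vectors are defined, so the details here are assumptions that follow the usual presentation in Cormode and Muthukrishnan (2005): width ceil(e/epsilon), depth ceil(ln(1/delta)), and hash functions of the form ((a*x + b) mod p) mod width. The class name, method names, and the seed parameter are hypothetical and chosen only for this example.

```python
import math
import random


class CountMinSketch:
    """Illustrative Count-Min sketch over integer item identifiers."""

    PRIME = (1 << 61) - 1  # a Mersenne prime used as the hash modulus

    def __init__(self, epsilon, delta, seed=0):
        # Assumed parameter choice: width = ceil(e / epsilon),
        # depth = ceil(ln(1 / delta)), at least one row.
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))
        rng = random.Random(seed)
        # One (a, b) pair per row defines h(x) = ((a*x + b) mod p) mod width.
        self.hashes = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(self.depth)
        ]
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _bucket(self, row, item):
        a, b = self.hashes[row]
        return ((a * item + b) % self.PRIME) % self.width

    def update(self, item, count=1):
        """Add count (which may be negative) to dimension `item` of the vector."""
        for row in range(self.depth):
            self.table[row][self._bucket(row, item)] += count

    def point_query(self, item):
        """Estimate one entry of the vector; assumes all entries are non-negative."""
        return min(
            self.table[row][self._bucket(row, item)] for row in range(self.depth)
        )

    def inner_product(self, other):
        """Estimate the dot product of two non-negative sketched vectors."""
        if self.hashes != other.hashes or self.width != other.width:
            raise ValueError("sketches must share parameters and hash functions")
        return min(
            sum(x * y for x, y in zip(mine, theirs))
            for mine, theirs in zip(self.table, other.table)
        )


# Usage: 1000 insertions spread over 50 items, then 5 deletions of item 7.
cm = CountMinSketch(epsilon=0.01, delta=0.01, seed=1)
for i in range(1000):
    cm.update(i % 50)
cm.update(7, -5)
estimate = cm.point_query(7)  # never below the true count of 15
```

Each row of the table is a linear function of the input vector, which is why additions and subtractions can be applied in any order. For a vector with non-negative entries, taking the minimum over rows never underestimates the true value; under the parameter choice assumed above, the overestimate is at most epsilon times the sum of the entries with probability at least 1 - delta.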
Keywords
- Streaming algorithms
- Frequent items
- Approximate counting
- Sketch
Recommended Reading
Alon N, Matias Y, Szegedy M (1996) The space complexity of approximating the frequency moments. In: ACM symposium on theory of computing, Philadelphia, pp 20–29
Charikar M, Chen K, Farach-Colton M (2002) Finding frequent items in data streams. In: Proceedings of the international colloquium on automata, languages and programming (ICALP), Málaga
Cormode G, Hadjieleftheriou M (2009) Finding the frequent items in streams of data. Commun ACM 52(10):97–105
Cormode G, Muthukrishnan S (2005) An improved data stream summary: the Count-Min sketch and its applications. J Algorithms 55(1):58–75
Cormode G, Muthukrishnan S (2005) Summarizing and mining skewed data streams. In: SIAM conference on data mining, Newport Beach
Cormode G, Korn F, Muthukrishnan S, Johnson T, Spatscheck O, Srivastava D (2004) Holistic UDAFs at streaming speeds. In: ACM SIGMOD international conference on management of data, Paris, pp 35–46
Estan C, Varghese G (2002) New directions in traffic measurement and accounting. In: Proceedings of ACM SIGCOMM, computer communication review, vol 32(4), Pittsburgh, pp 323–338
Ganguly S, Lakshminath B (2006) Estimating entropy over data streams. In: European symposium on algorithms (ESA), Zurich
Indyk P (2003) Better algorithms for high-dimensional proximity problems via asymmetric embeddings. In: ACM-SIAM symposium on discrete algorithms, Baltimore
Lai YK, Byrd GT (2006) High-throughput sketch update on a low-power stream processor. In: Proceedings of the ACM/IEEE symposium on architecture for networking and communications systems, San Jose
Manerikar N, Palpanas T (2009) Frequent items in streaming data: an experimental evaluation of the state-of-the-art. Data Knowl Eng 68(4):415–430
Motwani R, Raghavan P (1995) Randomized algorithms. Cambridge University Press, Cambridge/New York
Roughan M, Zhang Y (2006) Secure distributed data mining and its application in large-scale network measurements. In: ACM SIGCOMM computer communication review (CCR), Pisa
Sarlós T, Benczúr A, Csalogány K, Fogaras D, Rácz B (2006) To randomize or not to randomize: space optimal summaries for hyperlink analysis. In: International conference on World Wide Web (WWW), Edinburgh
Spiegel J, Polyzotis N (2006) Graph-based synopses for relational selectivity estimation. In: ACM SIGMOD international conference on management of data, Chicago
Copyright information
© 2014 Springer Science+Business Media New York
About this entry
Cite this entry
Cormode, G. (2014). Count-Min Sketch. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-3-642-27848-8_579-1
DOI: https://doi.org/10.1007/978-3-642-27848-8_579-1
Publisher Name: Springer, Boston, MA
Online ISBN: 978-3-642-27848-8
eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering