A synopsis structure for a dataset S is any summary of S whose size is substantially smaller than S. Formally, its size is at most O(|S|^ε) for some constant ε < 1, where |S| is the size (in bytes) of S.
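As a rough illustration of the size bound, the arithmetic below works one instance. The values |S| = 10^12 bytes and ε = 1/2 are assumptions chosen for the example, not figures from the definition, and the bound holds only up to the constant factor hidden by the O-notation.

```latex
\[
  |S| = 10^{12}\ \text{bytes}, \qquad \varepsilon = \tfrac{1}{2}
  \quad\Longrightarrow\quad
  |S|^{\varepsilon} = \left(10^{12}\right)^{1/2} = 10^{6}\ \text{bytes}.
\]
```

Under these assumed values, a one-terabyte dataset would admit a synopsis on the order of one megabyte.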
Synopsis structures are small, often statistical, summaries of a dataset. The term serves as an umbrella for any summarization structure of sufficiently small size, such as random samples, histograms, wavelets, sketches, and top-k summaries.
Synopsis structures are most commonly used in conjunction with data streams. The goal is to construct, in one pass over the data stream, a synopsis structure that can be used to answer any query from a prespecified class of queries. That is, at any point, a user may pose a query Q on the data stream seen thus far, and a (typically approximate) answer to Q must be produced using only the current synopsis structure. Two key advantages of using a synopsis structure to answer queries are that the space overhead is...
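The one-pass construction described above can be sketched with a random sample, one of the synopsis types named earlier. The sketch below uses reservoir sampling, a standard technique chosen here for illustration and not one prescribed by this entry. The class name `ReservoirSample`, its `capacity` parameter, and the `estimate_fraction` and `estimate_count` methods are hypothetical names. The prespecified query class is assumed to be "what fraction (or how many) of the items seen so far satisfy a predicate".

```python
import random


class ReservoirSample:
    """Hypothetical synopsis: a fixed-size uniform random sample of a stream.

    Built in one pass with reservoir sampling; memory use is bounded by
    `capacity` no matter how many items the stream delivers.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity          # maximum number of items retained
        self.items = []                   # the current sample (the synopsis)
        self.seen = 0                     # number of stream items observed
        self._rng = random.Random(seed)   # private generator for repeatability

    def add(self, item):
        """Process one stream item; the item is never revisited."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # The reservoir is not yet full, so keep the item unconditionally.
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / seen by drawing
            # a slot in [0, seen) and replacing only if it is in the reservoir.
            slot = self._rng.randrange(self.seen)
            if slot < self.capacity:
                self.items[slot] = item

    def estimate_fraction(self, predicate):
        """Approximate fraction of stream items satisfying `predicate`."""
        if not self.items:
            return 0.0
        hits = sum(1 for x in self.items if predicate(x))
        return hits / len(self.items)

    def estimate_count(self, predicate):
        """Approximate number of stream items satisfying `predicate`."""
        return self.estimate_fraction(predicate) * self.seen


# Usage: summarize a stream of 100,000 integers with a 1,000-item sample,
# then answer a query using only the synopsis.
synopsis = ReservoirSample(capacity=1000, seed=42)
for value in range(100_000):
    synopsis.add(value)

even_fraction = synopsis.estimate_fraction(lambda v: v % 2 == 0)
even_count = synopsis.estimate_count(lambda v: v % 2 == 0)
```

The query can be posed at any point during the pass, since `items` and `seen` always describe the stream consumed so far. The answers are approximate: in this usage the true fraction of even values is 0.5, and the estimate varies around it depending on which items the sample happens to retain.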
- 1. Gibbons PB, Matias Y. Synopsis data structures for massive data sets. DIMACS Series in Discrete Mathematics and Theoretical Computer Science: External Memory Algorithms. 1999.