Abstract
We present novel algorithms for estimating the size of the natural join of two data streams. The algorithms have efficient update processing times and provide excellent quality of estimates.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ganguly, S., Kesh, D., Saha, C. (2005). Practical Algorithms for Tracking Database Join Sizes. In: Sarukkai, S., Sen, S. (eds) FSTTCS 2005: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2005. Lecture Notes in Computer Science, vol 3821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11590156_24
DOI: https://doi.org/10.1007/11590156_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30495-1
Online ISBN: 978-3-540-32419-5
eBook Packages: Computer Science, Computer Science (R0)