SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-temporal Queries on Moving Objects
Abstract
In this paper, we propose SCUBA, a Scalable Cluster-Based Algorithm for evaluating a large set of continuous queries over spatio-temporal data streams. The key idea of SCUBA is to group moving objects and queries at run-time into moving clusters based on common spatio-temporal properties, which optimizes query execution and thus facilitates scalability. SCUBA exploits shared cluster-based execution by abstracting the evaluation of a set of spatio-temporal queries as a two-step join. The first step is a spatial join between moving clusters; this cluster-based filtering prunes true negatives. The second step is a fine-grained within-moving-cluster join, performed for every pair of moving clusters that the cluster-level join identified as potentially joinable. A moving cluster can also serve as an approximation of the location of its members. We show how moving clusters can serve as a means for intelligent load shedding of spatio-temporal data, avoiding performance degradation with minimal harm to result quality. Our experiments on real datasets demonstrate that SCUBA can achieve a substantial improvement when executing continuous queries on spatio-temporal data streams.
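The two-step evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `MovingCluster` class, the circular cluster shape, the circular range queries, and all function names are assumptions made for illustration only. The sketch also assumes that each cluster's circle bounds every member object and the full range of every member query, so that the cluster-level filter discards only true negatives.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class MovingCluster:
    """Hypothetical cluster: a circle that bounds all of its members.

    Assumption: the circle covers every member object and the whole
    range region of every member query.
    """
    cx: float
    cy: float
    radius: float
    objects: dict = field(default_factory=dict)  # object id -> (x, y)
    queries: dict = field(default_factory=dict)  # query id -> (x, y, range)


def clusters_overlap(a, b):
    """Step 1 (coarse filter): do the two bounding circles intersect?"""
    return hypot(a.cx - b.cx, a.cy - b.cy) <= a.radius + b.radius


def join_within(obj_cluster, qry_cluster):
    """Step 2 (fine-grained join): test objects of one cluster against
    the range queries of another."""
    results = set()
    for qid, (qx, qy, qr) in qry_cluster.queries.items():
        for oid, (ox, oy) in obj_cluster.objects.items():
            if hypot(ox - qx, oy - qy) <= qr:
                results.add((qid, oid))
    return results


def evaluate(clusters):
    """Join every pair of clusters (including a cluster with itself),
    running the fine-grained join only where the coarse filter matches."""
    results = set()
    for i, a in enumerate(clusters):
        for b in clusters[i:]:
            if not clusters_overlap(a, b):
                continue  # pruned: no member of a can match a member of b
            results |= join_within(a, b)
            if a is not b:
                results |= join_within(b, a)
    return results


if __name__ == "__main__":
    c1 = MovingCluster(0, 0, 5,
                       objects={"o1": (1, 1), "o2": (3, 0)},
                       queries={"q1": (0, 0, 2)})
    c2 = MovingCluster(4, 0, 3,
                       objects={"o3": (4, 1)},
                       queries={"q2": (3.5, 0, 1.5)})
    c3 = MovingCluster(100, 100, 5, objects={"o4": (100, 100)})
    print(sorted(evaluate([c1, c2, c3])))
    # [('q1', 'o1'), ('q2', 'o2'), ('q2', 'o3')]
```

In this example the distant cluster `c3` is discarded by the cluster-level test alone, so none of its members is compared individually. The pairwise loop over clusters is the simplest possible choice; a spatial index over cluster positions could replace it.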
Keywords
Cluster Member; Query Execution; Continuous Query; Location Update; Incremental Cluster