Abstract
In this paper, we propose SCUBA, a Scalable Cluster-Based Algorithm for evaluating a large set of continuous queries over spatio-temporal data streams. The key idea of SCUBA is to group moving objects and queries at run-time into moving clusters based on common spatio-temporal properties, which optimizes query execution and thus facilitates scalability. SCUBA exploits shared cluster-based execution by abstracting the evaluation of a set of spatio-temporal queries as, first, a spatial join between moving clusters. This cluster-based filtering prunes true negatives. Execution then proceeds with a fine-grained within-moving-cluster join for every pair of moving clusters that the cluster-level join identified as potentially joinable. A moving cluster can also serve as an approximation of the location of its members. We show how moving clusters can serve as a means for intelligent load shedding of spatio-temporal data, avoiding performance degradation with minimal harm to result quality. Our experiments on real datasets demonstrate that SCUBA can achieve a substantial improvement when executing continuous queries on spatio-temporal data streams.
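The two-step evaluation described above (a coarse join between moving clusters, then a fine-grained join inside each surviving pair) can be illustrated with a minimal sketch. This is not the authors' implementation: the names Cluster, clusters_overlap, join_within and evaluate are hypothetical, clusters are assumed to be circles that fully enclose their member objects and the full range of their member queries, and queries are assumed to be circular range queries.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Cluster:
    """Hypothetical moving cluster: a circle enclosing its members.

    Assumption: the circle covers every member object and the whole
    range of every member query, so the coarse filter never drops a
    true match.
    """
    cx: float
    cy: float
    radius: float
    objects: dict = field(default_factory=dict)  # object id -> (x, y)
    queries: dict = field(default_factory=dict)  # query id -> (x, y, range)


def clusters_overlap(a: Cluster, b: Cluster) -> bool:
    """Coarse filter: two circles intersect if centers are close enough."""
    return hypot(a.cx - b.cx, a.cy - b.cy) <= a.radius + b.radius


def join_within(a: Cluster, b: Cluster) -> set:
    """Fine-grained join: queries of cluster a against objects of cluster b."""
    matches = set()
    for qid, (qx, qy, qrange) in a.queries.items():
        for oid, (ox, oy) in b.objects.items():
            if hypot(qx - ox, qy - oy) <= qrange:
                matches.add((qid, oid))
    return matches


def evaluate(clusters: list) -> set:
    """Join every overlapping cluster pair, including a cluster with itself."""
    results = set()
    for i, a in enumerate(clusters):
        for b in clusters[i:]:
            if not clusters_overlap(a, b):
                continue  # pruned: no member of a can match a member of b
            results |= join_within(a, b)
            if a is not b:
                results |= join_within(b, a)
    return results


# Illustrative data only, not taken from the paper.
example_clusters = [
    Cluster(0, 0, 5, objects={"o1": (2, 0)}, queries={"q1": (3, 0, 2)}),
    Cluster(8, 0, 4, objects={"o2": (4.5, 0)}),
    Cluster(100, 100, 3, objects={"o3": (100, 100)},
            queries={"q3": (101, 100, 1.5)}),
]
matches = evaluate(example_clusters)
```

In this example the third cluster is far from the other two, so both of its cross-cluster pairs are discarded by the coarse filter before any individual object or query is examined. The sketch compares all cluster pairs directly; a scalable system would presumably index the clusters rather than enumerate pairs.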
Keywords
- Cluster Member
- Query Execution
- Continuous Query
- Location Update
- Incremental Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nehme, R.V., Rundensteiner, E.A. (2006). SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-temporal Queries on Moving Objects. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_58
DOI: https://doi.org/10.1007/11687238_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science (R0)