SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-temporal Queries on Moving Objects

  • Rimma V. Nehme
  • Elke A. Rundensteiner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)

Abstract

In this paper, we propose SCUBA, a Scalable Cluster-Based Algorithm for evaluating a large set of continuous queries over spatio-temporal data streams. The key idea of SCUBA is to group moving objects and queries at run-time into moving clusters based on common spatio-temporal properties, in order to optimize query execution and thus facilitate scalability. SCUBA exploits shared cluster-based execution by abstracting the evaluation of a set of spatio-temporal queries as a spatial join, performed first between moving clusters. This cluster-based filtering prunes true negatives. The execution then proceeds with a fine-grained within-moving-cluster join for all pairs of moving clusters identified as potentially joinable by a positive cluster-join match. A moving cluster can also serve as an approximation of the location of its members. We show how moving clusters can serve as a means for intelligent load shedding of spatio-temporal data to avoid performance degradation with minimal harm to result quality. Our experiments on real datasets demonstrate that SCUBA can achieve a substantial improvement when executing continuous queries on spatio-temporal data streams.
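
The two-phase evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `MovingCluster` class, the circular cluster bound, the rectangular range queries, and the function names are all assumptions introduced here for illustration. The sketch also assumes that each cluster's circle encloses every member object and the full extent of every member query, so that the coarse filter never discards a true match.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class MovingCluster:
    """Hypothetical moving cluster: a circle assumed to enclose all members."""
    cx: float
    cy: float
    radius: float
    objects: dict = field(default_factory=dict)  # object id -> (x, y)
    queries: dict = field(default_factory=dict)  # query id -> (xmin, ymin, xmax, ymax)


def clusters_overlap(a, b):
    """Coarse filter: do the two bounding circles intersect?"""
    return hypot(a.cx - b.cx, a.cy - b.cy) <= a.radius + b.radius


def join_within(a, b):
    """Fine-grained step: test the objects of `a` against the range queries of `b`."""
    results = set()
    for oid, (x, y) in a.objects.items():
        for qid, (xmin, ymin, xmax, ymax) in b.queries.items():
            if xmin <= x <= xmax and ymin <= y <= ymax:
                results.add((qid, oid))
    return results


def evaluate(clusters):
    """Join-between clusters first, then join-within only for overlapping pairs."""
    results = set()
    for i, a in enumerate(clusters):
        for b in clusters[i:]:  # includes the pair (a, a) for members of one cluster
            if not clusters_overlap(a, b):
                continue  # pruned: no object of one can satisfy a query of the other
            results |= join_within(a, b)
            if a is not b:
                results |= join_within(b, a)
    return results
```

In this sketch the pairwise loop over clusters is quadratic and serves only to show the filter-then-refine structure; the abstract does not state how the cluster-level join is organized internally.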

Keywords

Cluster Member, Query Execution, Continuous Query, Location Update, Incremental Cluster
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Rimma V. Nehme (1)
  • Elke A. Rundensteiner (2)
  1. Department of Computer Science, Purdue University
  2. Department of Computer Science, Worcester Polytechnic Institute
