ClusterSheddy: Load Shedding Using Moving Clusters over Spatio-temporal Data Streams

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNISA, volume 4443)

Abstract

Moving object environments are characterized by large numbers of objects that continuously send location updates. At times, data arrival rates may spike, causing the load on the system to exceed its capacity. This can increase output latency and potentially lead to invalid or obsolete answers. Dropping data randomly, the load shedding approach most frequently used in the literature, may adversely affect the accuracy of the results. We therefore propose a load shedding technique customized for spatio-temporal stream data. In our model, spatio-temporal properties such as location, time, direction, and speed over time serve as critical factors in the load shedding decision. The main idea is to abstract similarly moving objects into moving clusters, which serve as summaries of their members’ movement. Based on resource restrictions, members within clusters may be selectively discarded, while their locations are approximated by their respective moving clusters. Our experimental study illustrates the performance gains achieved by our load shedding framework and the tradeoff between the amount of data shed and the result accuracy.
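
The cluster-based idea in the abstract can be illustrated with a minimal sketch. It is not the ClusterSheddy implementation: the names `MovingCluster`, `build_cluster`, `shed_members`, and `locate` are hypothetical, and the shedding policy shown (discard the members closest to the centroid first, because the centroid approximates them with the least error) is an assumption made for illustration only.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class MovingCluster:
    """Hypothetical summary of a group of similarly moving objects."""
    cx: float  # centroid x
    cy: float  # centroid y
    vx: float  # average velocity, x component
    vy: float  # average velocity, y component
    exact: dict = field(default_factory=dict)  # object id -> (x, y) kept in full
    shed: set = field(default_factory=set)     # ids whose positions were discarded


def build_cluster(updates):
    """Summarise location updates given as {id: (x, y, vx, vy)}."""
    n = len(updates)
    cx = sum(u[0] for u in updates.values()) / n
    cy = sum(u[1] for u in updates.values()) / n
    vx = sum(u[2] for u in updates.values()) / n
    vy = sum(u[3] for u in updates.values()) / n
    exact = {oid: (u[0], u[1]) for oid, u in updates.items()}
    return MovingCluster(cx, cy, vx, vy, exact)


def shed_members(cluster, budget):
    """Keep at most `budget` exact positions (assumed policy, see lead-in)."""
    def error(oid):
        x, y = cluster.exact[oid]
        return hypot(x - cluster.cx, y - cluster.cy)

    # Members nearest the centroid are cheapest to approximate, so drop them first.
    by_error = sorted(cluster.exact, key=error)
    excess = max(0, len(by_error) - budget)
    for oid in by_error[:excess]:
        del cluster.exact[oid]
        cluster.shed.add(oid)


def locate(cluster, oid):
    """Return the exact position if retained, else the cluster centroid."""
    if oid in cluster.exact:
        return cluster.exact[oid]
    if oid in cluster.shed:
        return (cluster.cx, cluster.cy)
    raise KeyError(oid)
```

In this sketch a query for a shed member is still answered, but with the cluster centroid in place of the member's own position, which is the accuracy-for-load tradeoff the abstract describes.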

Keywords

  • Data Stream
  • Cluster Member
  • Continuous Query
  • Location Update
  • Query Answer

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nehme, R.V., Rundensteiner, E.A. (2007). ClusterSheddy: Load Shedding Using Moving Clusters over Spatio-temporal Data Streams. In: Kotagiri, R., Krishna, P.R., Mohania, M., Nantajeewarawat, E. (eds) Advances in Databases: Concepts, Systems and Applications. DASFAA 2007. Lecture Notes in Computer Science, vol 4443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71703-4_54

  • DOI: https://doi.org/10.1007/978-3-540-71703-4_54

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71702-7

  • Online ISBN: 978-3-540-71703-4

  • eBook Packages: Computer Science, Computer Science (R0)