Monitoring distributed fragmented skylines
Abstract
Distributed skyline computation is important for a wide range of domains, from distributed and web-based systems to ISP-network monitoring and distributed databases. The problem is particularly challenging in dynamic distributed settings, where the goal is to efficiently monitor a continuous skyline query over a collection of distributed streams. All existing work relies on the assumption of a single point of reference for object attributes/dimensions: objects may be vertically or horizontally partitioned, but the accurate value of each dimension for each object is always maintained by a single site. This assumption is unrealistic for several distributed applications, where object information is fragmented over a set of distributed streams (each monitored by a different site) and needs to be aggregated (e.g., averaged) across several sites. Furthermore, it is frequently useful to define skyline dimensions through complex functions over the aggregated objects, which raises further challenges for dealing with distribution and object fragmentation. We present the first known distributed algorithms for continuous monitoring of skylines over complex functions of fragmented multi-dimensional objects. Our algorithms rely on decomposing the skyline monitoring problem into a select set of distributed threshold-crossing queries, which can be monitored locally at each site. We propose several optimizations, including: (a) a technique for adaptively determining the most efficient monitoring strategy for each object, (b) an approximate monitoring technique, and (c) a strategy that reduces communication overhead by grouping together threshold-crossing queries. Furthermore, we discuss how our proposed algorithms can be used to address other continuous query types. A thorough experimental study with synthetic and real-life data sets verifies the effectiveness of our schemes and demonstrates order-of-magnitude improvements in communication costs compared to the only alternative, a centralized solution.
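To make the notion of a fragmented skyline concrete, the following minimal sketch shows the centralized baseline the abstract compares against: every site ships its local vectors to one place, the fragments are averaged, and the skyline is computed over the averages. It is not the distributed algorithm proposed in the paper. The function names, the sample data, and the "larger is better" dominance convention are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def aggregate(fragments: List[Dict[str, Vector]]) -> Dict[str, List[float]]:
    """Average each object's local vectors over all sites (hypothetical helper)."""
    n_sites = len(fragments)
    first = fragments[0]
    return {
        obj: [sum(site[obj][d] for site in fragments) / n_sites
              for d in range(len(first[obj]))]
        for obj in first
    }


def dominates(a: Vector, b: Vector) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one.

    Assumption: larger values are preferred in every dimension.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def skyline(points: Dict[str, Vector]) -> List[str]:
    """Return the ids of all objects not dominated by any other object."""
    return sorted(
        obj for obj, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != obj)
    )


# Two sites, each observing a fragment of the same three objects (made-up data).
site1 = {"a": (4.0, 2.0), "b": (1.0, 5.0), "c": (1.0, 1.0)}
site2 = {"a": (2.0, 2.0), "b": (3.0, 3.0), "c": (3.0, 1.0)}

global_skyline = skyline(aggregate([site1, site2]))
local_skyline_site2 = skyline(site2)
```

In this toy data the skyline over the averaged objects contains `a` and `b`, while the skyline computed from site 2's fragment alone contains only `b`. No single site can decide skyline membership from its own fragment, which is why the naive approach has to ship every update to a coordinator.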
Keywords
Skylines, Fragmented skylines, Distributed skylines, Geometric method