Theory of Computing Systems, Volume 57, Issue 4, pp 1008–1037

Parallel Skyline Queries

  • Foto N. Afrati
  • Paraschos Koutris
  • Dan Suciu
  • Jeffrey D. Ullman

Abstract

In this paper, we design and analyze parallel algorithms for skyline queries. The skyline of a multidimensional set consists of the points for which no other point exists that is at least as good along every dimension. As a framework for parallel computation, we use both the MP model proposed in Koutris and Suciu (2011), which requires that the data be perfectly load-balanced, and the GMP model, a variation of the model in Afrati and Ullman (2010), which imposes weaker load-balancing constraints. In addition to load balancing, we want to minimize the number of blocking steps, in which all processors must wait and synchronize. We propose a 2-step algorithm in the MP model for any dimension of the dataset, as well as a 1-step algorithm for the case of 2 and 3 dimensions. Finally, we present a 1-step algorithm in the GMP model for any number of dimensions and a 1-step algorithm in the MP model for uniform distributions of data points.
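The skyline definition above can be illustrated with a minimal Python sketch. It is not one of the paper's MP or GMP algorithms: the function names are hypothetical, "better" is assumed to mean a smaller coordinate value, and `skyline_partitioned` only simulates, on a single machine, the generic idea of computing local skylines per partition and then merging them.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is a different point that is at least as good as q
    in every dimension (assumption: smaller values are better)."""
    return p != q and all(a <= b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive quadratic skyline: keep every point that no other point dominates."""
    pts = list(dict.fromkeys(points))  # drop exact duplicates, keep input order
    return [q for q in pts if not any(dominates(p, q) for p in pts)]


def skyline_partitioned(points: Sequence[Point], num_parts: int) -> List[Point]:
    """Generic local-then-merge scheme (an illustration, not the paper's method).

    Round-robin partitioning stands in for distributing data to processors.
    A point dominated inside its own partition is dominated globally, so
    merging the local skylines and filtering once more yields the full skyline.
    """
    chunks = [points[i::num_parts] for i in range(num_parts)]
    local = [p for chunk in chunks for p in skyline(chunk)]
    return skyline(local)


if __name__ == "__main__":
    data = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (2, 6)]
    print(skyline(data))
    print(skyline_partitioned(data, 3))
```

The merge step in this sketch gathers all local skylines in one place, which is exactly the kind of load imbalance and extra synchronization that the models discussed in the abstract are designed to constrain.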

Keywords

Skyline queries · Parallel computation · Grid partitioning

References

  1. Afrati, F.N., Ullman, J.D.: Optimizing joins in a map-reduce environment. In: EDBT, ACM International Conference Proceeding Series, vol. 426, pp 99–110. ACM (2010)
  2. Berenbrink, P., Friedetzky, T., Hu, Z., Martin, R.A.: On weighted balls-into-bins games. Theor. Comput. Sci. 409(3), 511–520 (2008)
  3. Börzsönyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: ICDE, pp 421–430. IEEE Computer Society (2001)
  4. Chomicki, J., Godfrey, P., Gryz, J., Liang, D.: Skyline with presorting. In: ICDE, pp 717–816. IEEE Computer Society (2003)
  5. Cosgaya-Lozano, A., Rau-Chaplin, A., Zeh, N.: Parallel computation of skyline queries. In: HPCS, p 12. IEEE Computer Society (2007)
  6. Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large clusters. In: OSDI, pp 137–150 (2004)
  7. Dehne, F.K.H.A., Fabri, A., Rau-Chaplin, A.: Scalable parallel geometric algorithms for coarse grained multicomputers. In: Symposium on Computational Geometry, pp 298–307 (1993)
  8. Gates, A., Natkovich, O., Chopra, S., Kamath, P., Narayanam, S., Olston, C., Reed, B., Srinivasan, S., Srivastava, U.: Building a high-level dataflow system on top of MapReduce: The Pig experience. PVLDB 2(2), 1414–1425 (2009)
  9. Godfrey, P., Shipley, R., Gryz, J.: Maximal vector computation in large data sets. In: VLDB, pp 229–240. ACM (2005)
  10. Hellerstein, J.M.: The declarative imperative: experiences and conjectures in distributed logic. SIGMOD Record 39(1), 5–19 (2010)
  11. Karloff, H.J., Suri, S., Vassilvitskii, S.: A model of computation for MapReduce. In: SODA, pp 938–948. SIAM (2010)
  12. Köhler, H., Yang, J., Zhou, X.: Efficient parallel skyline processing using hyperplane projections. In: SIGMOD Conference, pp 85–96. ACM (2011)
  13. Koutris, P., Suciu, D.: Parallel evaluation of conjunctive queries. In: PODS, pp 223–234. ACM (2011)
  14. Kung, H.T., Luccio, F., Preparata, F.P.: On finding the maxima of a set of vectors. J. ACM 22(4), 469–476 (1975)
  15. Lee, K.C.K., Zheng, B., Li, H., Lee, W.C.: Approaching the skyline in Z order. In: VLDB, pp 279–290. ACM (2007)
  16. Matousek, J.: Computing dominances in E^n. Inf. Process. Lett. 38(5), 277–278 (1991)
  17. Papadias, D., Tao, Y., Fu, G., Seeger, B.: Progressive skyline computation in database systems. ACM Trans. Database Syst. 30(1), 41–82 (2005)
  18. Park, S., Kim, T., Park, J., Kim, J., Im, H.: Parallel skyline computation on multicore architectures. In: ICDE, pp 760–771. IEEE (2009)
  19. Raab, M., Steger, A.: Balls into bins - a simple and tight analysis. In: RANDOM, pp 159–170 (1998)
  20. Rocha-Junior, J.B., Vlachou, A., Doulkeridis, C., Nørvåg, K.: AGiDS: A grid-based strategy for distributed skyline query processing. In: Globe, Lecture Notes in Computer Science, vol. 5697, pp 12–23. Springer (2009)
  21. Stojmenovic, I., Miyakawa, M.: An optimal parallel algorithm for solving the maximal elements problem in the plane. Parallel Comput. 7(2), 249–251 (1988)
  22. Vlachou, A., Doulkeridis, C., Kotidis, Y.: Angle-based space partitioning for efficient parallel skyline computation. In: SIGMOD Conference, pp 227–238. ACM (2008)
  23. Wang, S., Ooi, B.C., Tung, A.K.H., Xu, L.: Efficient skyline query processing on peer-to-peer networks. In: ICDE, pp 1126–1135. IEEE (2007)
  24. Wu, P., Zhang, C., Feng, Y., Zhao, B.Y., Agrawal, D., Abbadi, A.E.: Parallelizing skyline queries for scalable distribution. In: EDBT, Lecture Notes in Computer Science, vol. 3896, pp 112–130. Springer (2006)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Foto N. Afrati (1)
  • Paraschos Koutris (2)
  • Dan Suciu (2)
  • Jeffrey D. Ullman (3)

  1. National Technical University of Athens, Athens, Greece
  2. University of Washington, Seattle, USA
  3. Stanford University, Stanford, USA
