Abstract
In this paper, we design and analyze parallel algorithms for skyline queries. The skyline of a multidimensional set consists of the points for which no other point exists that is at least as good along every dimension. As a framework for parallel computation, we use both the MP model proposed in Koutris and Suciu (2011), which requires that the data be perfectly load-balanced, and a variation of the model in Afrati and Ullman (2010), the GMP model, which imposes weaker load balancing constraints. In addition to load balancing, we want to minimize the number of blocking steps, where all processors must wait and synchronize. We propose a 2-step algorithm in the MP model for any dimension of the dataset, as well as a 1-step algorithm for the case of 2 and 3 dimensions. Finally, we present a 1-step algorithm in the GMP model for any number of dimensions and a 1-step algorithm in the MP model for uniform distributions of data points.
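The skyline definition above can be made concrete with a minimal Python sketch. It assumes that larger coordinate values are "better" and that the input follows set semantics; the function names `dominates`, `skyline`, and `local_then_merge_skyline` are hypothetical and chosen only for illustration. The last function simulates a generic local-then-merge pattern with one blocking step. It is not the algorithm proposed in the paper, and it makes no attempt to satisfy the load balancing constraints of the MP or GMP models.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p is a different point that is at least as good as q
    in every dimension (assumption: larger coordinate values are better)."""
    return p != q and all(a >= b for a, b in zip(p, q))


def skyline(points: Iterable[Point]) -> List[Point]:
    """Naive quadratic skyline: keep each point that no other point dominates."""
    pts = set(points)  # set semantics, so duplicates collapse
    return sorted(q for q in pts if not any(dominates(p, q) for p in pts))


def local_then_merge_skyline(points: Sequence[Point], p: int) -> List[Point]:
    """Generic two-phase pattern (hypothetical helper, not the paper's algorithm)."""
    # Phase 1: deal the points round-robin to p simulated processors,
    # each of which computes the skyline of its own share.
    local = [skyline(points[i::p]) for i in range(p)]
    # Blocking step: every local result must be available before merging.
    candidates = [pt for part in local for pt in part]
    # Phase 2: the skyline of the surviving candidates is the global skyline.
    return skyline(candidates)


if __name__ == "__main__":
    data = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
    print(skyline(data))                        # [(1, 5), (2, 4), (3, 3)]
    print(local_then_merge_skyline(data, p=3))  # [(1, 5), (2, 4), (3, 3)]
```

The merge phase is correct because a point dominated within its own share is also dominated globally, so discarding it early never removes a skyline point. In this sketch all candidates end up at a single place, which is exactly the kind of imbalance the models discussed in the abstract are meant to rule out.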
Notes
Throughout this paper, we will assume set (and not bag) semantics.
If the Conjunctive Query has k variables, then ε is at most 1/k.
In [13], the size of the broadcast data was required to be O(n^ε), for some ε < 1. In this paper we impose a stricter bound, by requiring it to be independent of n.
It will be one of p, p log p, or p^(1/(d−1)).
References
Afrati, F.N., Ullman, J.D.: Optimizing joins in a map-reduce environment. In: EDBT, ACM International Conference Proceeding Series, vol. 426, pp 99–110. ACM (2010)
Berenbrink, P., Friedetzky, T., Hu, Z., Martin, R.A.: On weighted balls-into-bins games. Theor. Comput. Sci. 409(3), 511–520 (2008)
Börzsönyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: ICDE, pp 421–430. IEEE Computer Society (2001)
Chomicki, J., Godfrey, P., Gryz, J., Liang, D.: Skyline with presorting. In: ICDE, pp 717–719. IEEE Computer Society (2003)
Cosgaya-Lozano, A., Rau-Chaplin, A., Zeh, N.: Parallel computation of skyline queries. In: HPCS, p 12. IEEE Computer Society (2007)
Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large clusters. In: OSDI, pp 137–150 (2004)
Dehne, F.K.H.A., Fabri, A., Rau-Chaplin, A.: Scalable parallel geometric algorithms for coarse grained multicomputers. In: Symposium on Computational Geometry, pp 298–307 (1993)
Gates, A., Natkovich, O., Chopra, S., Kamath, P., Narayanam, S., Olston, C., Reed, B., Srinivasan, S., Srivastava, U.: Building a high-level dataflow system on top of MapReduce: The Pig experience. PVLDB 2(2), 1414–1425 (2009)
Godfrey, P., Shipley, R., Gryz, J.: Maximal vector computation in large data sets. In: VLDB, pp 229–240. ACM (2005)
Hellerstein, J.M.: The declarative imperative: experiences and conjectures in distributed logic. SIGMOD Record 39(1), 5–19 (2010)
Karloff, H.J., Suri, S., Vassilvitskii, S.: A model of computation for MapReduce. In: SODA, pp 938–948. SIAM (2010)
Köhler, H., Yang, J., Zhou, X.: Efficient parallel skyline processing using hyperplane projections. In: SIGMOD Conference, pp 85–96. ACM (2011)
Koutris, P., Suciu, D.: Parallel evaluation of conjunctive queries. In: PODS, pp 223–234. ACM (2011)
Kung, H.T., Luccio, F., Preparata, F.P.: On finding the maxima of a set of vectors. J. ACM 22(4), 469–476 (1975)
Lee, K.C.K., Zheng, B., Li, H., Lee, W.C.: Approaching the skyline in Z order. In: VLDB, pp 279–290. ACM (2007)
Matousek, J.: Computing dominances in E^n. Inf. Process. Lett. 38(5), 277–278 (1991)
Papadias, D., Tao, Y., Fu, G., Seeger, B.: Progressive skyline computation in database systems. ACM Trans. Database Syst. 30(1), 41–82 (2005)
Park, S., Kim, T., Park, J., Kim, J., Im, H.: Parallel skyline computation on multicore architectures. In: ICDE, pp 760–771. IEEE (2009)
Raab, M., Steger, A.: Balls into bins - a simple and tight analysis. In: RANDOM, pp 159–170 (1998)
Rocha-Junior, J.B., Vlachou, A., Doulkeridis, C., Nørvåg, K.: AGiDS: A grid-based strategy for distributed skyline query processing. In: Globe, Lecture Notes in Computer Science, vol. 5697, pp 12–23. Springer (2009)
Stojmenovic, I., Miyakawa, M.: An optimal parallel algorithm for solving the maximal elements problem in the plane. Parallel Comput. 7(2), 249–251 (1988)
Vlachou, A., Doulkeridis, C., Kotidis, Y.: Angle-based space partitioning for efficient parallel skyline computation. In: SIGMOD Conference, pp 227–238. ACM (2008)
Wang, S., Ooi, B.C., Tung, A.K.H., Xu, L.: Efficient skyline query processing on peer-to-peer networks. In: ICDE, pp 1126–1135. IEEE (2007)
Wu, P., Zhang, C., Feng, Y., Zhao, B.Y., Agrawal, D., Abbadi, A.E.: Parallelizing skyline queries for scalable distribution. In: EDBT, Lecture Notes in Computer Science, vol. 3896, pp 112–130. Springer (2006)
Cite this article
Afrati, F.N., Koutris, P., Suciu, D. et al. Parallel Skyline Queries. Theory Comput Syst 57, 1008–1037 (2015). https://doi.org/10.1007/s00224-015-9627-3
DOI: https://doi.org/10.1007/s00224-015-9627-3
Keywords
- Skyline queries
- Parallel computation
- Grid partitioning