Skip to main content
Log in

PRS: efficient range skyline computation on massive data via presorting

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

In many applications, range skyline query is an important operation to find the interesting tuples in a potentially huge data space. Given selection condition, range skyline query returns tuples both satisfying the specified selection condition and not dominated by other satisfying tuples. It is found that most of the existing skyline algorithms do not consider the selection condition. This paper proposes a novel table-scan-based Presorted-table-based Range Skyline (PRS) algorithm to efficiently compute range skyline results on massive data. PRS first presorts the table for early termination. The early termination checking is proposed in this paper, along with the theoretical analysis of scan depth. The selection checking and dominance checking are devised in this paper to skip the unsatisfying or dominated tuples directly. The theoretical analysis proves that the overwhelming majority of candidates can be skipped. The extensive experimental results, conducted on synthetic and real-life data sets, show that PRS outperforms the existing algorithms significantly.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20

Similar content being viewed by others

Notes

  1. In this paper, the tuples are considered in general position [31].

References

  1. Berger M (1987) Geometry I. Springer, Berlin

    Book  MATH  Google Scholar 

  2. Börzsönyi S, Kossmann D, Stocker K (2001) The skyline operator. In: Proceedings of the 17th international conference on data engineering, pp 421–430

  3. Chomicki J, Godfrey P, Gryz J, Liang D (2003) Skyline with presorting. In: Proceedings of the 19th international conference on data engineering, pp 717–719

  4. Dellis E, Vlachou A, Vladimirskiy I, et al (2006) Constrained subspace skyline computation. In: Proceedings of the 15th ACM international conference on information and knowledge management, pp 415–424

  5. Endres M (2015) The structure of preference orders. In: Proceedings of 19th East European conference on advances in databases and information systems, pp 32–45

  6. Endres M, Kießling W (2011a) Semi-skyline optimization of constrained skyline queries. In: 22nd Australasian database conference, pp 7–16

  7. Endres M, Kießling W (2011b) Skyline snippets. In: Proceedings of 9th international conference on flexible query answering systems, pp 246–257

  8. Endres M, Kießling W (2014) High parallel skyline computation over low-cardinality domains. In: Proceedings of the 18th East European conference on advances in databases and information systems, pp 97–111

  9. Endres M, Kießling W (2015) Parallel skyline computation exploiting the lattice structure. J Database Manag 26(4):18–43

    Article  Google Scholar 

  10. Endres M, Preisinger T (2017) Beyond skylines: explicit preferences. In: Proceedings of the 22nd international conference on database systems for advanced applications, Part I, pp 327–342

  11. Endres M, Roocks P, Kießling W (2015) Scalagon: an efficient skyline algorithm for all seasons. In: Proceedings of 20th international conference on database systems for advanced applications, Part II, pp 292–308

  12. Fagin R, Lotem A, Naor M (2001) Optimal aggregation algorithms for middleware. In: Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 102–113

  13. Godfrey P, Shipley R, Gryz J (2007) Algorithms and analyses for maximal vector computation. VLDB J 16(1):5–28

    Article  Google Scholar 

  14. Han X, Li J, Gao H (2015) Efficient top-k retrieval on massive data. IEEE Trans Knowl Data Eng 27(10):2687–2699

    Article  Google Scholar 

  15. Han X, Li J, Yang D, Wang J (2013) Efficient skyline computation on big data. IEEE Trans Knowl Data Eng 25(11):2521–2535

    Article  Google Scholar 

  16. Hsueh Y, Lin C, Chang C (2017) An efficient indexing method for skyline computations with partially ordered domains. IEEE Trans Knowl Data Eng 29(5):963–976

    Article  Google Scholar 

  17. Kießling W, Endres M, Wenzel F (2011) The preference SQL system—an overview. IEEE Data Eng Bull 34(2):11–18

    Google Scholar 

  18. Korn F, Pagel B, Faloutsos C (2001) On the ’dimensionality curse’ and the ’self-similarity blessing’. IEEE Trans Knowl Data Eng 13(1):96–111

    Article  Google Scholar 

  19. Kossmann D, Ramsak F, Rost S (2002) Shooting stars in the sky: an online algorithm for skyline queries. In: Proceedings of the 28th international conference on very large data bases, pp 275–286

  20. Kung H, Luccio F, Preparata F (1975) On finding the maxima of a set of vectors. J ACM 22(4):469–476

    Article  MathSciNet  MATH  Google Scholar 

  21. Lee J, Hwang S (2014a) Scalable skyline computation using a balanced pivot selection technique. Inf Syst 39:1–21

    Article  Google Scholar 

  22. Lee J, Hwang S (2014b) Toward efficient multidimensional subspace skyline computation. VLDB J 23(1):129–145

    Article  Google Scholar 

  23. Liu X, Lu R, Ma J et al (2016) Efficient and privacy-preserving skyline computation framework across domains. Future Gener Comput Syst 62(C):161–174

    Article  Google Scholar 

  24. Mandl S, Kozachuk O, Endres M, Kießling W (2015) Preference analytics in exasolution. In: Datenbanksysteme für Business, Technologie und Web (BTW), 16. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS). Proceedings, pp 613–632

  25. Morse M, Patel J, Jagadish H (2007) Efficient skyline computation over low-cardinality domains. In: Proceedings of the 33rd international conference on very large data bases, pp 267–278

  26. Mortensen M, Chester S, Assent I, Magnani M (2015) Efficient caching for constrained skyline queries. In: Proceedings of the 18th international conference on extending database technology, pp 337–348

  27. O’Neil P, Quass D (1997) Improved query performance with variant indexes. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data, pp 38–49

  28. Papadias D, Tao Y, Fu G, Seeger B (2005) Progressive skyline computation in database systems. ACM Trans Database Syst 30(1):41–82

    Article  Google Scholar 

  29. Park Y, Min J, Shim K (2017) Efficient processing of skyline queries using mapreduce. IEEE Trans Knowl Data Eng 29(5):1031–1044

    Article  Google Scholar 

  30. Pei J, Yuan Y, Lin X et al (2006) Towards multidimensional subspace skyline analysis. ACM Trans Database Syst 31(4):1335–1381

    Article  Google Scholar 

  31. Sheng C, Tao Y (2011) On finding skylines in external memory. In: Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, pp 107–116

  32. Steuer R (1986) Multiple criteria optimization. Wiley, Hoboken

    MATH  Google Scholar 

  33. Tan K, Eng P, Ooi B (2001) Efficient progressive skyline computation. In: Proceedings of the 27th international conference on very large data bases, pp 301–310

  34. Tao Y, Xiao X, Pei J (2007) Efficient skyline and top-k retrieval in subspaces. IEEE Trans Knowl Data Eng 19(8):1072–1088

    Article  Google Scholar 

  35. Wu K, Shoshani A, Stockinger K (2008) Analyses of multi-level and multi-component compressed bitmap indexes. ACM Trans Database Syst 35(1):2:1–2:52

    Google Scholar 

  36. Xia T, Zhang D, Fang Z et al (2012) Online subspace skyline query processing using the compressed skycube. ACM Trans Database Syst 37(2):15:1–15:36

    Article  Google Scholar 

  37. Xin D, Han J, Cheng H, Li X (2006) Answering top-k queries with multi-dimensional selections: the ranking cube approach. In: Proceedings of the 32nd international conference on very large data bases, pp 463–474

  38. Zhang M, Alhajj R (2010) Skyline queries with constraints: integrating skyline and traditional query operators. Data Knowl Eng 69(1):153–168

    Article  Google Scholar 

Download references

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61872106, 61632010, 61502121 and 61402130, National key research and development program of China under Grant No. 2016YFB1000703 and Weihai-HIT co-construction program under Grant No. ZMZ001702.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xixian Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 2301 KB)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Han, X., Li, X., Wang, B. et al. PRS: efficient range skyline computation on massive data via presorting. Knowl Inf Syst 60, 1511–1548 (2019). https://doi.org/10.1007/s10115-018-1310-y

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-018-1310-y

Keywords

Navigation