On High Dimensional Skylines

  • Chee-Yong Chan
  • H. V. Jagadish
  • Kian-Lee Tan
  • Anthony K. H. Tung
  • Zhenjie Zhang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)

Abstract

In many decision-making applications, the skyline query is frequently used to find a set of dominating data points (called skyline points) in a multi-dimensional dataset. In a high-dimensional space, however, skyline points no longer offer any interesting insights because there are too many of them. In this paper, we introduce a novel metric, called skyline frequency, that compares and ranks the interestingness of data points based on how often they are returned in the skyline when different subsets of the dimensions (i.e., subspaces) are considered. Intuitively, a point with a high skyline frequency is more interesting because it is dominated on fewer combinations of the dimensions. The problem thus becomes one of finding the top-k frequent skyline points. However, the algorithms proposed so far for skyline computation typically do not scale well with dimensionality. Moreover, frequent skyline computation requires that skylines be computed for each of an exponential number of subsets of the dimensions. We present efficient approximate algorithms to address these twin difficulties. Our extensive performance study shows that our approximate algorithm can run fast and compute the correct result on large data sets in high-dimensional spaces.
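
To make the definition of skyline frequency concrete, the following is a minimal brute-force sketch in Python. It is not the approximate algorithm the paper proposes; it enumerates every non-empty subspace exactly, which is the exponential cost the abstract describes. The function names (`dominates`, `skyline`, `skyline_frequencies`, `top_k_frequent`) and the sample points are hypothetical, and the sketch assumes that smaller values are preferred on every dimension.

```python
from itertools import combinations


def dominates(p, q, dims):
    """True if p is no worse than q on every dimension in dims and
    strictly better on at least one (assumption: smaller is better)."""
    return (all(p[d] <= q[d] for d in dims)
            and any(p[d] < q[d] for d in dims))


def skyline(points, dims):
    """Indices of the points that no other point dominates in subspace dims."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p, dims) for q in points)]


def skyline_frequencies(points):
    """For each point, count the non-empty subspaces whose skyline contains it.

    Exhaustive enumeration: 2^d - 1 subspaces for d dimensions, so this is
    only practical for small d.
    """
    n_dims = len(points[0])
    freq = [0] * len(points)
    for size in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), size):
            for i in skyline(points, dims):
                freq[i] += 1
    return freq


def top_k_frequent(points, k):
    """Return (index, frequency) pairs for the k points with the highest
    skyline frequency; ties are broken by the lower index."""
    freq = skyline_frequencies(points)
    order = sorted(range(len(points)), key=lambda i: (-freq[i], i))
    return [(i, freq[i]) for i in order[:k]]


if __name__ == "__main__":
    # Hypothetical 3-dimensional data set, 7 non-empty subspaces in total.
    sample = [(1, 4, 3), (2, 2, 2), (3, 1, 5), (4, 4, 4)]
    print(skyline_frequencies(sample))  # [4, 5, 4, 0]
    print(top_k_frequent(sample, 2))    # [(1, 5), (0, 4)]
```

In this sample, the point `(2, 2, 2)` is not the best on most single dimensions, yet it appears in the skyline of five of the seven subspaces, while `(4, 4, 4)` never appears. This is the kind of ranking that skyline frequency is meant to expose.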

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chee-Yong Chan (1)
  • H. V. Jagadish (1)
  • Kian-Lee Tan (1)
  • Anthony K. H. Tung (1)
  • Zhenjie Zhang (1)

  1. National University of Singapore & University of Michigan
