World Wide Web, Volume 15, Issue 1, pp 33–60

Finding superior skyline points for multidimensional recommendation applications

  • Jing Yang
  • Gabriel Pui Cheong Fung
  • Wei Lu
  • Xiaofang Zhou
  • Hong Chen
  • Xiaoyong Du
Article

Abstract

In a typical Web recommendation system, objects are often described by many attributes, and the system must serve many users with a diverse range of preferences. In other words, it must efficiently support high-dimensional preference queries that allow the user to explore the data space effectively without imposing specific preference weightings for each dimension. The skyline query, which produces a set of objects guaranteed to contain all top-ranked objects for any linear attribute preference combination, has been proposed to support this type of recommendation application. However, it suffers from the problem known as the ‘dimensionality curse’: the size of the skyline query result set can grow exponentially with the number of dimensions. Therefore, when the dimensionality is high, a large percentage of objects can become skyline points, which makes such a recommendation system less usable. In this paper, we propose a stronger type of skyline query, called the core skyline query, that adopts a new quality measure called vertical dominance to return only an interesting subset of the traditional skyline points. An efficient query processing method is proposed to find core skyline points using a novel indexing structure called Linked Multiple B’-trees (LMB). Our approach can find such superior skyline points progressively, without first computing the entire set of skyline points.
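
For readers unfamiliar with the traditional skyline operator that the abstract builds on, the following is a minimal sketch of a naive quadratic skyline computation. It is not the core skyline query, vertical dominance, or the LMB index proposed in the paper, none of which are defined in this abstract. The hotel data, the attribute meanings, and the names `dominates`, `skyline`, `hotels_2d`, and `hotels_3d` are assumptions made for illustration, with smaller values taken to be better on every attribute.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p is no worse than q on every attribute and
    strictly better on at least one (assumption: smaller is better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels described by (price, distance to venue).
hotels_2d: List[Point] = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 4)]

# The same hotels with a third, made-up attribute (noise level) appended.
hotels_3d: List[Point] = [(50, 8, 3), (60, 3, 4), (80, 1, 5), (70, 5, 1), (90, 4, 2)]

print(skyline(hotels_2d))       # three of the five hotels survive
print(len(skyline(hotels_3d)))  # all five hotels survive
```

In two dimensions, (70, 5) and (90, 4) are both dominated by (60, 3), so only three hotels remain. Adding a single attribute is enough to make every hotel incomparable with the others, which is a small-scale picture of the result-set growth the abstract describes.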

Keywords

preference query; recommendation systems; high dimensional data; vertical dominance; core skyline points; linked multiple B’-tree

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Jing Yang (1, 2)
  • Gabriel Pui Cheong Fung (3)
  • Wei Lu (1, 2)
  • Xiaofang Zhou (1, 2, 4)
  • Hong Chen (1, 2)
  • Xiaoyong Du (1, 2)
  1. School of Information, Renmin University of China, Beijing, China
  2. Key Labs of Data Engineering and Knowledge Engineering, Ministry of Education, Beijing, China
  3. School of Computing Informatics, Arizona State University, Phoenix, USA
  4. School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia