DEXA 2009: Database and Expert Systems Applications, pp. 471–485
Reaching the Top of the Skyline: An Efficient Indexed Algorithm for Top-k Skyline Queries
Abstract
Criteria that induce a Skyline naturally represent users' preference conditions, which are useful for discarding irrelevant data in large datasets. However, in high-dimensional Skyline spaces, the Skyline can still be very large, making it infeasible for users to process this set of points. To identify the best points among the Skyline, the Top-k Skyline approach has been proposed. Top-k Skyline uses discriminatory criteria to induce a total order on the points that comprise the Skyline, and recognizes the best or top-k objects based on these criteria. Different algorithms have been defined to compute the top-k objects among the Skyline; while existing solutions are able to produce the Top-k Skyline, they may be very costly. First, state-of-the-art Top-k Skyline solutions require the computation of the whole Skyline; second, they execute probes of the multicriteria function over all the Skyline points. Thus, if k is much smaller than the cardinality of the Skyline, these solutions may be very inefficient because a large number of unnecessary probes may be evaluated. In this paper, we propose TKSI, an efficient solution for the Top-k Skyline that overcomes the drawbacks of existing solutions. TKSI is an index-based algorithm that computes only the subset of the Skyline required to produce the top-k objects; thus, TKSI is able to minimize the number of unnecessary probes. We have empirically studied the quality of TKSI, and we report initial experimental results showing that TKSI speeds up the computation of the Top-k Skyline by at least 50% w.r.t. the state-of-the-art solutions when k is smaller than the size of the Skyline.
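To make the cost described in the abstract concrete, the following is a minimal sketch of the baseline strategy it criticizes: compute the whole Skyline first, then probe the score function over every Skyline point. It is not TKSI, whose indexed strategy is not detailed in the abstract. The names `dominates`, `skyline`, and `naive_top_k_skyline`, the larger-is-better convention, and the example score function are all assumptions made for illustration.

```python
from heapq import nlargest
from typing import Callable, List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is at least as good in every dimension and
    strictly better in at least one (assumption: larger is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Quadratic-time Skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def naive_top_k_skyline(
    points: List[Point], score: Callable[[Point], float], k: int
) -> List[Point]:
    """Baseline: materialize the full Skyline, then probe `score` on all of it.

    The score function is evaluated once per Skyline point, even when k is
    far smaller than the Skyline size.
    """
    sky = skyline(points)
    return nlargest(k, sky, key=score)


# Hypothetical data: (4, 4) is dominated by (5, 5) and drops out of the Skyline.
data = [(1, 9), (5, 5), (9, 1), (4, 4), (2, 8)]
top2 = naive_top_k_skyline(data, score=lambda p: 2 * p[0] + p[1], k=2)
print(top2)
```

In this sketch the score function is probed on all four Skyline points although only two results are requested; avoiding such unnecessary probes is the inefficiency the abstract says TKSI targets.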
Keywords
Preference based Queries, Skyline, Top-k