Distributed and Parallel Databases, Volume 31, Issue 1, pp 47–70

Personalized ranking in web databases: establishing and utilizing an appropriate workload

Article

Abstract

The emergence of the deep Web has given a new connotation to the concept of ranking database query results. Earlier ranking approaches relied either on analyzing the frequencies of database values and query logs or on establishing user profiles. In contrast, an integrated approach that holistically supports user- and query-dependent ranking, based on the notion of a similarity model, was recently proposed (Telang et al. in IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011). An important component of this framework is a workload consisting of ranking functions, wherein each function represents an individual user’s preferences towards the results of a specific query. When answering a query for which no prior ranking function exists, the similarity model is employed; it is expected to yield good ranking quality as long as the workload contains a ranking function for a very similar user-query pair.
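The lookup described above can be illustrated with a minimal sketch. It assumes the workload is a mapping from user-query pairs to scoring functions and that some similarity function over pairs is supplied by the caller. The names `pick_ranking_function` and `toy_similarity`, the identifiers `u1`/`q1`, and the sample rows are all hypothetical; the actual similarity model is defined in Telang et al. (2011) and is not reproduced here.

```python
from typing import Callable, Dict, Optional, Tuple

Pair = Tuple[str, str]                     # (user id, query id), hypothetical keys
RankingFunction = Callable[[dict], float]  # scores a single result tuple


def pick_ranking_function(
    user: str,
    query: str,
    workload: Dict[Pair, RankingFunction],
    similarity: Callable[[Pair, Pair], float],
) -> Optional[RankingFunction]:
    """Return the stored function for (user, query) if present; otherwise the
    function of the most similar stored pair; None if the workload is empty."""
    target = (user, query)
    if target in workload:
        return workload[target]
    if not workload:
        return None
    best_pair = max(workload, key=lambda pair: similarity(pair, target))
    return workload[best_pair]


def toy_similarity(a: Pair, b: Pair) -> float:
    # Assumption for illustration only: one point for a shared user,
    # one point for a shared query. Not the paper's similarity model.
    return float(a[0] == b[0]) + float(a[1] == b[1])


workload = {
    ("u1", "q1"): lambda row: -row["price"],    # u1 prefers cheaper results for q1
    ("u2", "q2"): lambda row: -row["mileage"],  # u2 prefers lower mileage for q2
}

rows = [
    {"price": 9000, "mileage": 80000},
    {"price": 12000, "mileage": 30000},
]

# No function is stored for (u2, q3), so the closest pair (u2, q2) is used.
scorer = pick_ranking_function("u2", "q3", workload, toy_similarity)
ranked = sorted(rows, key=scorer, reverse=True)
```

In this sketch the quality of the fallback depends entirely on whether a sufficiently similar pair is already stored, which is why the composition of the workload matters.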

In this paper, we address the problem of determining an appropriate set of user-query pairs to form a workload of ranking functions that supports user- and query-dependent ranking for Web databases. We propose a novel metric, termed workload goodness, that quantifies the notion of a “good” workload as an absolute value. Finding a workload of optimal goodness is combinatorially explosive; we therefore propose a heuristic solution and advance three approaches for determining an acceptable workload in static as well as dynamic environments. We discuss the effectiveness of our proposal analytically as well as experimentally over two Web databases.

Keywords

Ranking, Web databases, Similarity model, Workload

References

  1. Agrawal, R., Rantzau, R., Terzi, E.: Context-sensitive ranking. In: SIGMOD Conference, pp. 383–394. ACM, New York (2006)
  2. Agrawal, S., Chaudhuri, S., Das, G., Gionis, A.: Automated ranking of database query results. In: Conference on Innovations in Database Research (CIDR) (2003)
  3. Balabanovic, M., Shoham, Y.: Content-based collaborative recommendation. ACM Commun. 40(3), 66–72 (1997)
  4. Basilico, J., Hofmann, T.: A joint framework for collaborative and content filtering. In: SIGIR, pp. 550–551 (2004)
  5. Basu, C., Hirsh, H., Cohen, W.W.: Recommendation as classification: using social and content-based information in recommendation. In: AAAI/IAAI, pp. 714–720 (1998)
  6. Bergman, M.K.: The deep web: surfacing hidden value. J. Electron. Publ. 7(1) (2001)
  7. Billsus, D., Pazzani, M.J.: Learning collaborative information filters. In: International Conference on Machine Learning (ICML), pp. 46–54 (1998)
  8. Blum, M., Floyd, R.W., Pratt, V., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7, 448–461 (1973)
  9. Chang, K.C.-C., He, B., Li, C., Patil, M., Zhang, Z.: Structured databases on the web: observations and implications. SIGMOD Rec. 33(3), 61–70 (2004)
  10. Chaudhuri, S., Das, G., Hristidis, V., Weikum, G.: Probabilistic ranking of database query results. In: VLDB, pp. 888–899 (2004)
  11. Chaudhuri, S., Das, G., Hristidis, V., Weikum, G.: Probabilistic information retrieval approach for ranking of database query results. TODS 31(3), 1134–1168 (2006)
  12. Foltz, P.W., Dumais, S.T.: Personalized information delivery: an analysis of information filtering methods. ACM Commun. 35(12), 51–60 (1992)
  13. Gauch, S., Speretta, M., Chandramouli, A., Micarelli, A.: User profiles for personalized information access. In: The Adaptive Web, pp. 54–89 (2007)
  14. Google. Google base. http://www.google.com/base
  15. Hofmann, T.: Collaborative filtering via gaussian probabilistic latent semantic analysis. In: SIGIR, pp. 259–266 (2003)
  16. Hwang, S.-W.: Supporting ranking for data retrieval. Ph.D. thesis, University of Illinois, Urbana Champaign (2005)
  17. Ilyas, I.F., Soliman, M.A.: Probabilistic Ranking Techniques in Relational Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers (2011)
  18. Kanungo, T., Mount, D.: An efficient k-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 881–892 (2002)
  19. Koutrika, G.: Database query personalization. In: EDBT, pp. 147–152 (2005)
  20. Koutrika, G., Ioannidis, Y.E.: Personalization of queries in database systems. In: ICDE, pp. 597–608 (2004)
  21. Koutrika, G., Ioannidis, Y.E.: Constrained optimalities in query personalization. In: SIGMOD Conference, pp. 73–84 (2005)
  22. Li, C., Chang, K.C.-C., Ilyas, I.F., Song, S.: Ranksql: query algebra and optimization for relational top-k queries. In: SIGMOD Conference, pp. 131–142 (2005)
  23. Marian, A., Bruno, N., Gravano, L.: Evaluating top-k queries over web-accessible databases. ACM Trans. Database Syst. 29(2), 319–362 (2004)
  24. Ortega-Binderberger, M., Chakrabarti, K., Mehrotra, S.: An approach to integrating query refinement in sql. In: EDBT, pp. 15–33 (2002)
  25. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and metrics for cold-start recommendations. In: SIGIR, pp. 253–260 (2002)
  26. Soliman, M.A., Ilyas, I.F., Ben-David, S.: Supporting ranking queries on uncertain and incomplete data. VLDB J. 19(4), 477–501 (2010)
  27. Soliman, M.A., Ilyas, I.F., Martinenghi, D., Tagliasacchi, M.: Ranking with uncertain scoring functions: semantics and sensitivity measures. In: SIGMOD Conference, pp. 805–816 (2011)
  28. Su, W., Wang, J., Huang, Q., Lochovsky, F.: Query result ranking over e-commerce web databases. In: Conference on Information and Knowledge Management (CIKM), pp. 575–584 (2006)
  29. Telang, A., Li, C., Chakravarthy, S.: One size does not fit all: towards user- and query-dependent ranking for web databases. Technical report 6, University of Texas at Arlington (2009)
  30. Telang, A., Li, C., Chakravarthy, S.: One size does not fit all: towards user- and query-dependent ranking for web databases. IEEE Transactions on Knowledge and Data Engineering (TKDE) (2011)
  31. Werner, K.: Foundations of preferences in database systems. In: VLDB. VLDB Endowment, pp. 311–322 (2002)
  32. Yu, H., Hwang, S.-w., Chang, K.C.-C.: Enabling soft queries for data retrieval. Inf. Syst. 32(4), 560–574 (2007)
  33. Yu, H., Kim, Y., Hwang, S.-w.: Rv-svm: an efficient method for learning ranking svm. In: PAKDD, pp. 426–438 (2009)

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Aditya Telang (1)
  • Sharma Chakravarthy (1)
  • Chengkai Li (1)

  1. Dept. of Comp. Sci. & Engg., The University of Texas at Arlington, Arlington, USA
