Distributed and Parallel Databases, Volume 31, Issue 1, pp 47-70

Personalized ranking in web databases: establishing and utilizing an appropriate workload

  • Aditya Telang, Dept. of Comp. Sci. & Engg., The University of Texas at Arlington
  • Sharma Chakravarthy, Dept. of Comp. Sci. & Engg., The University of Texas at Arlington
  • Chengkai Li, Dept. of Comp. Sci. & Engg., The University of Texas at Arlington

The emergence of the deep Web has given a new connotation to the concept of ranking database query results. Earlier ranking approaches resorted either to analyzing frequencies of database values and query logs or to establishing user profiles. In contrast, an integrated approach, based on the notion of a similarity model, for holistically supporting user- and query-dependent ranking has recently been proposed (Telang et al. in IEEE Transactions on Knowledge and Data Engineering (TKDE), 2011). An important component of this framework is a workload consisting of ranking functions, wherein each function represents an individual user’s preferences towards the results of a specific query. When a query arrives for which no prior ranking function exists, the similarity model is employed; it is expected to ensure good ranking quality as long as a ranking function for a very similar user-query pair exists in this workload.

In this paper, we address the problem of determining an appropriate set of user-query pairs to form a workload of ranking functions to support user- and query-dependent ranking for Web databases. We propose a novel metric, termed workload goodness, that quantifies the notion of a “good” workload as an absolute value. Finding a workload of optimal goodness is a combinatorially explosive problem; therefore, we propose a heuristic solution and advance three approaches for determining an acceptable workload, in a static as well as a dynamic environment. We discuss the effectiveness of our proposal analytically as well as experimentally over two Web databases.
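
As a rough illustration of how a workload of ranking functions might be consulted at query time, the following minimal Python sketch returns the stored function for an exact user-query pair and otherwise falls back to the most similar pair in the workload. It is not the authors' similarity model: the names `pick_ranking_function`, `user_sim`, and `query_sim`, the toy similarity tables, and the choice to combine user and query similarity by a product are all assumptions made for illustration only.

```python
from typing import Callable, Dict, Optional, Tuple

# A ranking function scores one result tuple; higher means more preferred.
RankingFn = Callable[[dict], float]
Pair = Tuple[str, str]  # (user id, query id)


def pick_ranking_function(
    user: str,
    query: str,
    workload: Dict[Pair, RankingFn],
    user_sim: Callable[[str, str], float],
    query_sim: Callable[[str, str], float],
) -> Optional[RankingFn]:
    """Return the stored function for (user, query), else that of the most similar pair."""
    if (user, query) in workload:
        return workload[(user, query)]
    if not workload:
        return None
    # Assumption: overall similarity is the product of user and query similarity.
    best = max(workload, key=lambda p: user_sim(user, p[0]) * query_sim(query, p[1]))
    return workload[best]


# Toy, hypothetical data: two stored ranking functions and hand-made similarities.
workload: Dict[Pair, RankingFn] = {
    ("u1", "q1"): lambda row: row["price"],
    ("u2", "q2"): lambda row: -row["price"],
}
USER_SIM = {("u3", "u1"): 0.9, ("u3", "u2"): 0.2}
QUERY_SIM = {("q3", "q1"): 0.8, ("q3", "q2"): 0.4}


def user_sim(a: str, b: str) -> float:
    return USER_SIM.get((a, b), 0.0)


def query_sim(a: str, b: str) -> float:
    return QUERY_SIM.get((a, b), 0.0)


# No function is stored for (u3, q3), so the most similar stored pair is used.
chosen = pick_ranking_function("u3", "q3", workload, user_sim, query_sim)
```

In this sketch the quality of the fallback depends entirely on whether a sufficiently similar pair is present in the workload, which is why the choice of user-query pairs to include matters.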


Keywords: Ranking, Web databases, Similarity model, Workload