The VLDB Journal, Volume 18, Issue 2, pp 407–427

Anytime measures for top-k algorithms on exact and fuzzy data sets

  • Benjamin Arai
  • Gautam Das
  • Dimitrios Gunopulos
  • Nick Koudas
Special Issue Paper

DOI: 10.1007/s00778-008-0127-9

Cite this article as:
Arai, B., Das, G., Gunopulos, D. et al. The VLDB Journal (2009) 18: 407. doi:10.1007/s00778-008-0127-9

Abstract

Top-k queries on large multi-attribute data sets are fundamental operations in information retrieval and ranking applications. In this article, we initiate research on the anytime behavior of top-k algorithms on exact and fuzzy data. In particular, given specific top-k algorithms (TA and TA-Sorted), we study their progress toward identifying the correct result at any point during their execution. We adopt a probabilistic approach in which, at any point during the algorithm’s operation, we seek to report the confidence that the top-k result has been identified. Such functionality can be a valuable asset when one is interested in reducing the runtime cost of top-k computations. We present a thorough experimental evaluation to validate our techniques using both synthetic and real data sets.
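
To make the anytime setting concrete, the following is a minimal sketch of the classic threshold algorithm (TA) written as a generator that exposes its state after every round of sorted access. It is an illustration under stated assumptions, not the article's method: the function name `threshold_algorithm`, the sum aggregation, and the toy data are all hypothetical, and the probabilistic confidence measure described in the abstract is not reproduced here. The sketch reports only the deterministic gap between the current k-th best score and the TA threshold.

```python
import heapq


def threshold_algorithm(lists, k):
    """Anytime-style sketch of TA (hypothetical helper, sum aggregation assumed).

    lists: one list per attribute, each holding (object_id, score) pairs
           sorted by score in descending order; every object is assumed
           to appear in every list.
    Yields (depth, current_top_k, threshold, done) after each round.
    """
    # Random-access tables: object_id -> score, one per attribute.
    lookup = [dict(lst) for lst in lists]
    seen = set()
    top = []  # min-heap of (total_score, object_id), at most k entries

    for depth in range(len(lists[0])):
        # Sorted access: read the next entry of every list.
        for lst in lists:
            obj, _ = lst[depth]
            if obj in seen:
                continue
            seen.add(obj)
            # Random access: fetch the object's score in all attributes.
            total = sum(table[obj] for table in lookup)
            if len(top) < k:
                heapq.heappush(top, (total, obj))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, obj))

        # No unseen object can score higher than this bound.
        threshold = sum(lst[depth][1] for lst in lists)
        done = len(top) == k and top[0][0] >= threshold
        yield depth + 1, sorted(top, reverse=True), threshold, done
        if done:
            return


# Toy data (assumed for illustration): two attributes, four objects.
by_x = [("a", 9), ("b", 8), ("c", 3), ("d", 1)]
by_y = [("b", 9), ("a", 7), ("d", 4), ("c", 2)]
steps = list(threshold_algorithm([by_x, by_y], k=2))
for depth, current, threshold, done in steps:
    print(depth, current, threshold, done)
```

A caller that stops consuming the generator early holds a candidate top-k without a guarantee; quantifying the confidence in such an early answer is the question the article addresses.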

Keywords

Approximate query, Anytime, Top-k, Fuzzy data

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • Benjamin Arai
    • 1
  • Gautam Das
    • 2
  • Dimitrios Gunopulos
    • 3
  • Nick Koudas
    • 4
  1. University of California, Riverside, USA
  2. University of Texas, Arlington, USA
  3. University of Athens, Athens, Greece
  4. University of Toronto, Toronto, Canada
