RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques

  • Yiming Ma
  • Sharad Mehrotra
  • Dawit Yimam Seid
  • Qi Zhong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3882)


In numerous applications that deal with similarity search, a user may not have an exact specification of his information need and/or may not be able to formulate a query that exactly captures his notion of similarity. A promising approach to mitigate this problem is to enable the user to submit a rough approximation of the desired query and use relevance feedback on retrieved objects to refine the query. In this paper, we explore such a refinement strategy for a general class of structured similarity queries. Our approach casts the refinement problem as that of learning concepts using the tuples on which the user provides feedback as a labeled training set. Under this setup, similarity query refinement consists of two learning tasks: learning the structure of the query and learning the relative importance of query components. The paper develops machine learning approaches suitable for the two learning tasks. The primary contribution of the paper is the Refinement Activation Framework (RAF) that decides when each learner is invoked. Experimental analysis over many real-life datasets shows that our strategy significantly outperforms existing approaches in terms of retrieval quality.
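
As a rough illustration of the second learning task only (learning the relative importance of query components), the sketch below re-weights the predicates of a weighted-sum similarity query from labeled feedback tuples. It is not the method developed in the paper: the weighted-sum scoring model, the update rule (a simple difference of mean predicate scores on relevant and non-relevant tuples), and all names (`close_to`, `score`, `reweight`, `rate`) are assumptions made for the example. Structure learning and the activation decision of RAF are not shown.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]
Predicate = Callable[[Row], float]  # similarity score in [0, 1]


def close_to(attr: str, target: float, scale: float) -> Predicate:
    """Hypothetical similarity predicate: 1.0 at target, falling linearly to 0."""
    return lambda row: max(0.0, 1.0 - abs(row[attr] - target) / scale)


def score(row: Row, predicates: List[Predicate], weights: List[float]) -> float:
    """Weighted-sum similarity of a row to the query (assumed scoring model)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * p(row) for p, w in zip(predicates, weights)) / total


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def reweight(
    predicates: List[Predicate],
    weights: List[float],
    feedback: List[Tuple[Row, bool]],
    rate: float = 0.5,
) -> List[float]:
    """Raise the weight of predicates that score relevant tuples higher than
    non-relevant ones, lower it otherwise, then normalize to sum to 1."""
    relevant = [row for row, is_relevant in feedback if is_relevant]
    irrelevant = [row for row, is_relevant in feedback if not is_relevant]
    updated = []
    for pred, w in zip(predicates, weights):
        pos = _mean([pred(r) for r in relevant])
        neg = _mean([pred(r) for r in irrelevant])
        updated.append(max(w + rate * (pos - neg), 0.0))
    total = sum(updated)
    if total == 0:
        return list(weights)  # feedback gave no usable signal; keep old weights
    return [w / total for w in updated]


if __name__ == "__main__":
    preds = [close_to("price", 100, 100), close_to("year", 2000, 20)]
    weights = [0.5, 0.5]
    feedback = [
        ({"price": 105, "year": 1985}, True),
        ({"price": 95, "year": 2010}, True),
        ({"price": 190, "year": 2001}, False),
    ]
    print(reweight(preds, weights, feedback))
```

In this toy run the user marks tuples with a price near 100 as relevant regardless of year, so the price predicate gains weight and the year predicate loses it.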


Keywords: Relevance Feedback; Original Query; Initial Query; Similarity Query; Activation Framework
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yiming Ma (1)
  • Sharad Mehrotra (1)
  • Dawit Yimam Seid (1)
  • Qi Zhong (1)

  1. Department of Computer Science, University of California, Irvine, USA
