RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 3882))

Abstract

In numerous applications that deal with similarity search, a user may not have an exact specification of his information need and/or may not be able to formulate a query that exactly captures his notion of similarity. A promising approach to mitigate this problem is to enable the user to submit a rough approximation of the desired query and use relevance feedback on retrieved objects to refine the query. In this paper, we explore such a refinement strategy for a general class of structured similarity queries. Our approach casts the refinement problem as that of learning concepts using the tuples on which the user provides feedback as a labeled training set. Under this setup, similarity query refinement consists of two learning tasks: learning the structure of the query and learning the relative importance of query components. The paper develops machine learning approaches suitable for the two learning tasks. The primary contribution of the paper is the Refinement Activation Framework (RAF) that decides when each learner is invoked. Experimental analysis over many real life datasets shows that our strategy significantly outperforms existing approaches in terms of retrieval quality.
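To make the second learning task (learning the relative importance of query components) concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a query is a weighted sum of per-predicate similarity scores in [0, 1], and it swaps in a simple heuristic that favours predicates scoring higher on tuples marked relevant than on tuples marked non-relevant. The names `combined_score` and `reweight`, and the predicate names used in the usage note, are hypothetical.

```python
from typing import Dict, List, Tuple

# Per-predicate similarity scores for one tuple, each assumed to lie in [0, 1].
Scores = Dict[str, float]


def combined_score(scores: Scores, weights: Dict[str, float]) -> float:
    """Rank a tuple by the weighted sum of its per-predicate similarity scores."""
    return sum(weights[p] * scores[p] for p in weights)


def reweight(feedback: List[Tuple[Scores, bool]],
             weights: Dict[str, float]) -> Dict[str, float]:
    """Illustrative heuristic (an assumption, not the paper's learner).

    A predicate's new weight is proportional to how much higher it scores,
    on average, on relevant tuples than on non-relevant ones.
    """
    relevant = [s for s, is_relevant in feedback if is_relevant]
    non_relevant = [s for s, is_relevant in feedback if not is_relevant]
    if not relevant or not non_relevant:
        # Without both kinds of feedback there is nothing to contrast.
        return dict(weights)

    raw = {}
    for p in weights:
        mean_rel = sum(s[p] for s in relevant) / len(relevant)
        mean_non = sum(s[p] for s in non_relevant) / len(non_relevant)
        raw[p] = max(mean_rel - mean_non, 0.0)

    total = sum(raw.values())
    if total == 0.0:
        # No predicate separates the two groups; keep the old weights.
        return dict(weights)
    return {p: v / total for p, v in raw.items()}
```

As a usage example, with equal starting weights on two hypothetical predicates `price` and `location`, feedback in which only `price` separates relevant from non-relevant tuples shifts the weight toward `price`, and the refined weights can then be passed back to `combined_score` to re-rank results.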

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ma, Y., Mehrotra, S., Seid, D.Y., Zhong, Q. (2006). RAF: An Activation Framework for Refining Similarity Queries Using Learning Techniques. In: Li Lee, M., Tan, KL., Wuwongse, V. (eds) Database Systems for Advanced Applications. DASFAA 2006. Lecture Notes in Computer Science, vol 3882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733836_41

  • DOI: https://doi.org/10.1007/11733836_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33337-1

  • Online ISBN: 978-3-540-33338-8

  • eBook Packages: Computer Science, Computer Science (R0)
