Abstract

With multimedia databases, it is difficult to specify queries directly and explicitly. Relevance feedback interactively learns a user’s desired output, or query concept, by asking the user whether certain proposed multimedia objects (e.g., images, videos, and songs) are relevant or not. To be effective, a learning algorithm must learn a user’s query concept accurately and quickly while asking the user to label only a small number of data instances. In addition, the concept-learning algorithm should take the complexity of a concept into account when choosing its learning strategy. This chapter\(^\dagger\) presents the use of support vector machine active learning in a concept-dependent way (\(\hbox{SVM}^{\rm CD}_{\rm Active}\)) for conducting relevance feedback. A concept’s complexity is characterized using three measures: hit-rate, isolation, and diversity. To reduce concept complexity and thereby improve concept learnability, a multimodal learning approach uses the semantic labels of data instances to adjust both the sampling strategy and the sampling pool of \(\hbox{SVM}^{\rm CD}_{\rm Active}\). Empirical studies on several datasets show that active learning outperforms traditional passive learning, and that concept-dependent learning is superior to concept-independent relevance-feedback schemes.

© ACM, 2004. This chapter is written based on the author’s work with Simon Tong [1], Kingshy Goh, and Wei-Cheng Lai [2]. Permission to publish this chapter is granted under copyright licenses #2587600971756 and #2587601214143.
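The relevance-feedback loop described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a linear SVM trained with a Pegasos-style subgradient method instead of a kernel SVM, it assumes the commonly used strategy of querying the unlabeled instances closest to the current decision boundary, and it omits the concept-dependent adjustments (hit-rate, isolation, diversity). All function names, parameters, and the toy data are hypothetical.

```python
import random

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Linear SVM via Pegasos-style stochastic subgradient descent on the
    hinge loss. A constant 1.0 is appended to each vector so that the last
    weight acts as a (regularized) bias term. Assumption: linear model."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            xi = list(X[i]) + [1.0]
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]   # shrink (regularizer)
            if margin < 1.0:                           # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def decision(w, x):
    """Signed score of x; positive means 'relevant' under the current model."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]

def select_queries(w, pool, labeled, k):
    """Active step: the k unlabeled pool indices closest to the hyperplane."""
    candidates = [i for i in range(len(pool)) if i not in labeled]
    candidates.sort(key=lambda i: abs(decision(w, pool[i])))
    return candidates[:k]

def top_results(w, pool, k):
    """Retrieval step: the k pool indices with the highest relevance score."""
    ranked = sorted(range(len(pool)), key=lambda i: decision(w, pool[i]),
                    reverse=True)
    return ranked[:k]

def relevance_feedback(pool, ask_user, seed_labels, rounds=3, k=4):
    """Alternate between training and asking the user to label k instances.
    ask_user(i) must return +1 (relevant) or -1 (irrelevant)."""
    labeled = dict(seed_labels)                        # index -> +1 / -1
    for _ in range(rounds):
        idx = sorted(labeled)
        w = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
        for i in select_queries(w, pool, labeled, k):
            labeled[i] = ask_user(i)
    idx = sorted(labeled)
    w = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
    return w, labeled

def make_toy_pool(n_per_class=30, seed=1):
    """Hypothetical 2-D pool: relevant items near (2, 2), others near (-2, -2)."""
    rng = random.Random(seed)
    pool, truth = [], []
    for label, center in ((+1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            pool.append([center + rng.uniform(-1, 1),
                         center + rng.uniform(-1, 1)])
            truth.append(label)
    return pool, truth

if __name__ == "__main__":
    pool, truth = make_toy_pool()
    # The simulated user answers from ground truth; one seed label per class.
    w, labeled = relevance_feedback(pool, lambda i: truth[i], {0: +1, 30: -1})
    print("labels requested:", len(labeled))
    print("top results:", top_results(w, pool, 5))
```

In this sketch the user labels only `rounds * k` instances beyond the seeds, which reflects the abstract's requirement that the learner ask for few labels. Swapping in a kernel SVM or a different sampling rule would change only `train_linear_svm` and `select_queries`.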

Notes

  1. The input space (denoted as X in the machine learning and statistics literature) is the original space in which the data vectors are located, and the feature space (denoted as F) is the space into which the data are projected, either linearly or non-linearly.

  2. Query expansion is a vital component of any retrieval system, and it remains a challenging research area. This component is not yet deployed in our system and is part of our ongoing research.

  3. Unlike some recently developed systems [23] that contain a semantic layer between image features and queries to assist query refinement, our system does not have an explicit semantic layer. We argue that a hard-coded semantic layer can make a retrieval system restrictive. Dynamically learning the semantics of a query concept is more flexible and hence makes the system more useful.

  4. Query concepts such as “animals”, “women”, and “European architecture” do not reside contiguously in the space formed by the image features.

References

  1. S. Tong, E. Chang, Support vector machine active learning for image retrieval, in Proceedings of the ACM International Conference on Multimedia, pp. 107–118 (2001)

  2. K.S. Goh, E.Y. Chang, W.C. Lai, Multimodal concept-dependent active learning for image retrieval, in Proceedings of the ACM International Conference on Multimedia, pp. 564–571 (2004)

  3. D. Lewis, W. Gale, A sequential algorithm for training text classifiers, in Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval (Springer, Heidelberg, 1994), pp. 3–12

  4. A. McCallum, K. Nigam, Employing EM in pool-based active learning for text classification, in Proceedings of the Fifteenth International Conference on Machine Learning (Morgan Kaufmann, San Francisco, 1998), pp. 350–358

  5. C. Burges, A tutorial on support vector machines for pattern recognition, in Proceedings of ACM KDD, pp. 121–167 (1998)

  6. V. Vapnik, Estimation of Dependences Based on Empirical Data (Springer, Heidelberg, 1982)

  7. C. Campbell, N. Cristianini, A. Smola, Query learning with large margin classifiers, in Proceedings of the Seventeenth International Conference on Machine Learning, pp. 111–118 (2000)

  8. G. Schohn, D. Cohn, Less is more: active learning with support vector machines, in Proceedings of the Seventeenth International Conference on Machine Learning, pp. 839–846 (2000)

  9. S. Tong, D. Koller, Support vector machine active learning with applications to text classification, in Proceedings of the 17th International Conference on Machine Learning, pp. 401–412 (June 2000)

  10. M. Mandel, G. Poliner, D. Ellis, Support vector machine active learning for music retrieval. Multimed. Syst. 12(1), 3–13 (2006)

  11. T. Mitchell, Generalization as search. Artif. Intell. 28, 203–226 (1982)

  12. J. Shawe-Taylor, N. Cristianini, Further results on the margin distribution, in Proceedings of the Twelfth Annual Conference on Computational Learning Theory, pp. 278–285 (1999)

  13. V. Vapnik, Statistical Learning Theory (Wiley, NY, 1998)

  14. R. Herbrich, T. Graepel, C. Campbell, Bayes point machines: estimating the Bayes point in kernel space, in International Joint Conference on Artificial Intelligence Workshop on Support Vector Machines, pp. 23–27 (1999)

  15. K. Brinker, Incorporating diversity in active learning with support vector machines, in Proceedings of the Twentieth International Conference on Machine Learning (ICML), pp. 59–66 (August 2003)

  16. N. Roy, A. McCallum, Toward optimal active learning through sampling estimation of error reduction, in Proceedings of the Eighteenth International Conference on Machine Learning (ICML), pp. 441–448 (August 2001)

  17. W.C. Lai, K. Goh, E.Y. Chang, On scalability of active learning for formulating query concepts, in Proceedings of Workshop on Computer Vision Meets Databases (CVDB) in cooperation with ACM International Conference on Management of Data (SIGMOD), pp. 11–18 (2004)

  18. R. Agrawal, Fast algorithms for mining association rules in large databases, in Proceedings of VLDB, pp. 487–499 (1994)

  19. R. Duda, P. Hart, D.G. Stork, Pattern Classification, 2nd edn. (Wiley, New York, 2001)

  20. C. Li, E. Chang, H. Garcia-Molina, G. Wiederhold, Clindex: approximate similarity queries in high-dimensional spaces. IEEE Trans. Knowl. Data Eng. (TKDE) 14(4), 792–808 (2002)

  21. E. Chang, B. Li, MEGA: the maximizing expected generalization algorithm for learning complex query concepts. ACM Trans. Inf. Syst. 21(4), 347–382 (2003)

  22. E. Chang, K. Goh, G. Sychay, G. Wu, Content-based soft annotation for multimodal image retrieval using Bayes point machines. IEEE Trans. Circuits Syst. Video Technol. Special Issue Concept. Dynamical Aspects Multimed. Content Descr. 13(1), 26–38 (2003)

  23. J. Wang, J. Li, G. Wiederhold, SIMPLIcity: semantics-sensitive integrated matching for picture libraries, in Proceedings of ACM Multimedia Conference, pp. 483–484 (2000)

  24. C. Bishop, Neural Networks for Pattern Recognition (Oxford University Press, Oxford, 1998)

  25. M. Kearns, U. Vazirani, An Introduction to Computational Learning Theory (MIT Press, USA, 1994)

  26. T.M. Mitchell, Machine Learning (McGraw-Hill, NY, 1997)

  27. X.S. Zhou, T.S. Huang, Comparing discriminating transformations and SVM for learning during multimedia retrieval, in Proceedings of ACM Conference on Multimedia, pp. 137–146 (2001)

  28. X.S. Zhou, T.S. Huang, Relevance feedback for image retrieval: a comprehensive review. ACM Multimed. Syst. J., Special Issue on CBIR 8, 536–544 (2003)

  29. K.S. Jones, P. Willett (eds.), Readings in Information Retrieval (Morgan Kaufmann, San Francisco, July 1997)

  30. K. Porkaew, K. Chakrabarti, S. Mehrotra, Query refinement for multimedia similarity retrieval in MARS, in Proceedings of ACM International Conference on Multimedia, pp. 235–238 (1999)

  31. L. Wu, C. Faloutsos, K. Sycara, T.R. Payne, Falcon: feedback adaptive loop for content-based retrieval, in Proceedings of the 26th VLDB Conference, pp. 279–306 (September 2000)

  32. M. Ortega-Binderberger, S. Mehrotra, Relevance feedback techniques in the MARS image retrieval system. Multimed. Syst. 9(6), 535–547 (2004)

  33. L. Breiman, Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)

  34. L. Breiman, Arcing classifiers. Ann. Statist. 26(3), 801–849 (1998)

  35. A. Grove, D. Schuurmans, Boosting in the limit: maximizing the margin of learned ensembles, in Proceedings of 15th National Conference on Artificial Intelligence (AAAI), pp. 692–699 (1998)

  36. R. Schapire, Y. Freund, P. Bartlett, W. Lee, Boosting the margin: a new explanation for the effectiveness of voting methods, in Proceedings of the Fourteenth International Conference on Machine Learning (Morgan Kaufmann, San Francisco, 1997), pp. 322–330

  37. H. Wu, H. Lu, S. Ma, A practical SVM-based algorithm for ordinal regression in image retrieval, in Proceedings of ACM International Conference on Multimedia, pp. 612–621 (2003)

  38. T. Dietterich, G. Bakiri, Solving multiclass learning problems via error-correcting output codes. J. Artif. Intell. Res. 2, 263–286 (1995)

  39. G. James, T. Hastie, Error coding and substitution PaCTs, in Proceedings of NIPS (1997)

  40. M. Moreira, E. Mayoraz, Improved pairwise coupling classification with error correcting classifiers, in Proceedings of ECML, pp. 160–171 (April 1998)

  41. D. Cohn, Z. Ghahramani, M. Jordan, Active learning with statistical models. J. Artif. Intell. Res. 4, 129–145 (1996)

  42. N. Cesa-Bianchi, Y. Freund, D. Haussler, D.P. Helmbold, R.E. Schapire, M.K. Warmuth, How to use expert advice. J. ACM 44(3), 427–485 (1997)

  43. T. Jaakkola, H. Siegelmann, Active information retrieval, in Proceedings of NIPS, pp. 777–784 (2001)

  44. S. Tong, E. Chang, Support vector machine active learning for image retrieval, in Proceedings of ACM International Conference on Multimedia, pp. 107–118 (October 2001)

  45. Y. Freund, H. Seung, E. Shamir, N. Tishby, Selective sampling using the query by committee algorithm. Mach. Learn. 28, 133–168 (1997)

  46. H. Seung, M. Opper, H. Sompolinsky, Query by committee, in Proceedings of the Fifth Workshop on Computational Learning Theory (Morgan Kaufmann, San Francisco, 1992), pp. 287–294

  47. I. Dagan, S. Engelson, Committee-based sampling for training probabilistic classifiers, in Proceedings of the Twelfth International Conference on Machine Learning (Morgan Kaufmann, San Francisco, 1995), pp. 150–157

  48. T. Joachims, Text categorization with support vector machines, in Proceedings of ECML (Springer, Heidelberg, 1998), pp. 137–142

  49. S.T. Dumais, J. Platt, D. Heckerman, M. Sahami, Inductive learning algorithms and representations for text categorization, in Proceedings of the Seventh International Conference on Information and Knowledge Management (ACM Press, NY, 1998), pp. 148–155

  50. I.J. Cox, M.L. Miller, S.M. Omohundro, P.N. Yianilos, PicHunter: Bayesian relevance feedback for image retrieval, in Proceedings of International Conference on Pattern Recognition, pp. 361–369 (August 1996)

  51. I.J. Cox, M.L. Miller, T.P. Minka, T.V. Papathomas, P.N. Yianilos, The Bayesian image retrieval system, PicHunter: theory, implementation and psychological experiments. IEEE Trans. Image Process. 9(1), 20–37 (2000)

  52. E.Y. Chang, B. Li, G. Wu, K.S. Goh, Statistical learning for effective visual information retrieval (invited paper), in Proceedings of IEEE International Conference on Image Processing (ICIP), pp. 609–612 (2003)

  53. J.J. Rocchio, Relevance feedback in information retrieval, in The SMART Retrieval System—Experiments in Automatic Document Processing, ed. by G. Salton (Prentice Hall, NJ, 1971), pp. 313–323

  54. Y. Ishikawa, R. Subramanya, C. Faloutsos, Mindreader: querying databases through multiple examples, in Proceedings of VLDB, pp. 218–227 (1998)

  55. M. Ortega, Y. Rui, K. Chakrabarti, A. Warshavsky, S. Mehrotra, T.S. Huang, Supporting ranked boolean similarity queries in MARS. IEEE Trans. Knowl. Data Eng. 10(6), 905–925 (1999)

  56. M. Ortega, Y. Rui, K. Chakrabarti, S. Mehrotra, T.S. Huang, Supporting similarity queries in MARS, in Proceedings of ACM International Conference on Multimedia, pp. 403–413 (1997)

  57. Y. Rui, T.S. Huang, M. Ortega, S. Mehrotra, Relevance feedback: a power tool in interactive content-based image retrieval. IEEE Trans. Circuits Syst. Video Technol. 8(5), 644–655 (1998)

  58. K. Porkaew, S. Mehrotra, M. Ortega, Query reformulation for content based multimedia retrieval in MARS, in Proceedings of ICMCS, pp. 747–751 (1999)

  59. M. Flickner, H. Sawhney, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, Query by image and video content: the QBIC system. IEEE Comput. 28(9), 23–32 (1995)

  60. A. Gupta, R. Jain, Visual information retrieval. Commun. ACM 40(5), 69–79 (1997)

  61. K.A. Hua, K. Vu, J.H. Oh, Sammatch: a flexible and efficient sampling-based image retrieval technique for image databases, in Proceedings of ACM Multimedia, pp. 225–234 (1999)

  62. B.S. Manjunath, W.Y. Ma, Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Anal. Mach. Intell. 18(8), 837–842 (1996)

  63. J.R. Smith, S.F. Chang, VisualSEEk: a fully automated content-based image query system, in Proceedings of ACM Multimedia, pp. 87–98 (1996)

  64. J.Z. Wang, G. Wiederhold, O. Firschein, S.X. Wei, Wavelet-based image indexing techniques with partial sketch retrieval capability, in Proceedings of the ADL, pp. 13–24 (May 1997)

Author information

Correspondence to Edward Y. Chang.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg and Tsinghua University Press

About this chapter

Cite this chapter

Chang, E.Y. (2011). Query Concept Learning. In: Foundations of Large-Scale Multimedia Information Management and Retrieval. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20429-6_3

  • DOI: https://doi.org/10.1007/978-3-642-20429-6_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20428-9

  • Online ISBN: 978-3-642-20429-6

  • eBook Packages: Computer Science, Computer Science (R0)
