Multiclass Boosting Framework for Multimodal Data Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8936)

Abstract

A large number of multimedia documents containing both text and images have appeared on the internet; as a result, cross-modal retrieval, in which the modality of a query differs from that of the retrieved results, is becoming an interesting search paradigm. In this paper, a multimodal multiclass boosting framework (MMB) is proposed to capture intra-modal semantic information and inter-modal semantic correlation. Unlike traditional boosting methods, which are confined to two classes or a single modality, MMB handles multiclass, multimodal data simultaneously. The empirical risk, which takes both intra-modal and inter-modal losses into account, is designed and then minimized by gradient descent in multidimensional functional spaces. More specifically, the optimization problem is solved for each modality in turn. A semantic space is then obtained naturally by applying the sigmoid function to the quasi-margins. Extensive experiments on the Wiki and NUS-WIDE datasets show that our method significantly outperforms existing approaches to cross-modal retrieval.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, S., Pan, P., Lu, Y., Jiang, S. (2015). Multiclass Boosting Framework for Multimodal Data Analysis. In: He, X., Luo, S., Tao, D., Xu, C., Yang, J., Hasan, M.A. (eds) MultiMedia Modeling. MMM 2015. Lecture Notes in Computer Science, vol 8936. Springer, Cham. https://doi.org/10.1007/978-3-319-14442-9_60

  • DOI: https://doi.org/10.1007/978-3-319-14442-9_60

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14441-2

  • Online ISBN: 978-3-319-14442-9

  • eBook Packages: Computer Science (R0)
