Multiclass Boosting Framework for Multimodal Data Analysis

Wang, Shixun; Pan, Peng; Lu, Yansheng; Jiang, Sheng

doi:10.1007/978-3-319-14442-9_60

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8936)

Abstract

A large number of multimedia documents containing both text and images have appeared on the internet, so cross-modal retrieval, in which the modality of the query differs from that of the retrieved results, has become an interesting search paradigm. In this paper, a multimodal multiclass boosting framework (MMB) is proposed to capture intra-modal semantic information and inter-modal semantic correlation. Unlike traditional boosting methods, which are confined to two classes or a single modality, MMB can handle multiple classes and multiple modalities simultaneously. The empirical risk, which takes both intra-modal and inter-modal losses into account, is designed and then minimized by gradient descent in multidimensional functional spaces. More specifically, the optimization problem is solved for each modality in turn. A semantic space is then obtained naturally by applying the sigmoid function to the quasi-margins. Extensive experiments on the Wiki and NUS-WIDE datasets show that our method significantly outperforms existing approaches to cross-modal retrieval.
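
The last step of the abstract, mapping quasi-margins into a shared semantic space and retrieving across modalities, can be illustrated with a minimal sketch. This is not the authors' implementation: the quasi-margin values are made up, the function names (to_semantic, rank) are hypothetical, and the use of cosine similarity for ranking is an assumption, since the abstract does not say which similarity measure is used.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def to_semantic(margins):
    """Map per-class quasi-margins to a semantic vector, one entry per class."""
    return [sigmoid(m) for m in margins]


def cosine(u, v):
    """Cosine similarity between two vectors (assumed ranking measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank(query_margins, candidate_margins):
    """Return candidate indices sorted from most to least similar to the query."""
    q = to_semantic(query_margins)
    scored = [(cosine(q, to_semantic(c)), i) for i, c in enumerate(candidate_margins)]
    return [i for _, i in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical quasi-margins over three classes.
# The image query scores highest on class 0.
image_query = [2.0, -1.0, -1.5]
# Three text documents, scoring highest on classes 1, 0 and 2 respectively.
text_items = [
    [-1.0, 2.5, -0.5],
    [1.8, -0.7, -1.2],
    [-2.0, -1.0, 3.0],
]

ranking = rank(image_query, text_items)
```

Because both modalities are projected onto the same per-class coordinates, an image query and a text document can be compared directly, which is what makes retrieval across modalities possible in this kind of semantic space.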

Author information

Authors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Shixun Wang, Peng Pan, Yansheng Lu & Sheng Jiang

Editor information

Editors and Affiliations

University of Technology, P.O. Box 123, 2007, Sydney, NSW, Australia
Xiangjian He
University of Newcastle, University Dr, Callaghan, 2308, NSW, Australia
Suhuai Luo
University of Technology, P.O. Box 123, 2007, Sydney, NSW, Australia
Dacheng Tao & Muhammad Abul Hasan
National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95, Zhongguancun East Road, 100190, Beijing, P.R. China
Changsheng Xu
Shanghai Jiaotong University, 800 Dong Chuan Rd, 200240, Shanghai, China
Jie Yang

About this paper

Cite this paper

Wang, S., Pan, P., Lu, Y., Jiang, S. (2015). Multiclass Boosting Framework for Multimodal Data Analysis. In: He, X., Luo, S., Tao, D., Xu, C., Yang, J., Hasan, M.A. (eds) MultiMedia Modeling. MMM 2015. Lecture Notes in Computer Science, vol 8936. Springer, Cham. https://doi.org/10.1007/978-3-319-14442-9_60

DOI: https://doi.org/10.1007/978-3-319-14442-9_60
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14441-2
Online ISBN: 978-3-319-14442-9
eBook Packages: Computer Science, Computer Science (R0)
