Abstract
This paper addresses the generation of contextual descriptions for images by learning an ontology-aware dictionary. An ontology specifies which entities exist and how those entities are related within a hierarchy. If the semantic hierarchies of visual concepts are incorporated into a learned visual dictionary composed of visual atoms, contextual descriptions of test images can be generated through reconstruction. We propose to learn the ontology-aware dictionary by integrating hierarchical dictionary learning and multi-task regression into a joint framework. A hierarchical regularization term defined over multiple semantic categories introduces the hierarchical structure into the multi-task regression, and jointly optimizing the sparse coding and the multi-task regression embeds the semantic hierarchies in the learned dictionary. Experiments on two benchmark datasets show improved performance for the proposed algorithm. Examples of the learned ontology-aware dictionary and of the generated image descriptions illustrate the effectiveness of the framework.
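To make the idea of a joint objective concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes a generic objective with four terms: reconstruction error, an l1 sparsity penalty on the codes, a multi-task regression loss, and a tree-structured group penalty over category rows of the regression matrix. All names (`tree_group_penalty`, `joint_objective`, `lam_sparse`, `lam_tree`, `mu`) and the toy hierarchy are hypothetical and chosen only for illustration.

```python
import numpy as np

def tree_group_penalty(W, groups, weights=None):
    """Sum of Frobenius norms of row blocks of W, one block per tree node.

    Each group lists the category indices under one node of an assumed
    semantic hierarchy, so sibling categories share a penalty term.
    """
    if weights is None:
        weights = [1.0] * len(groups)
    return sum(w * np.linalg.norm(W[list(g), :]) for g, w in zip(groups, weights))

def joint_objective(X, Y, D, A, W, groups, lam_sparse=0.1, lam_tree=0.1, mu=1.0):
    """Value of an assumed joint sparse-coding plus multi-task regression objective.

    Shapes: X (d, n) features, D (d, k) dictionary, A (k, n) sparse codes,
    Y (c, n) category labels, W (c, k) regression weights.
    """
    recon = 0.5 * np.linalg.norm(X - D @ A) ** 2      # reconstruction error
    sparsity = lam_sparse * np.abs(A).sum()           # l1 penalty on codes
    regress = 0.5 * mu * np.linalg.norm(Y - W @ A) ** 2  # multi-task regression loss
    hier = lam_tree * tree_group_penalty(W, groups)   # hierarchy-aware penalty
    return recon + sparsity + regress + hier

# Toy hierarchy over 3 categories: root {0,1,2}, internal node {0,1}, three leaves.
toy_groups = [[0, 1, 2], [0, 1], [0], [1], [2]]
```

In this sketch the hierarchy enters only through `groups`: categories under a common ancestor are penalized together, which encourages them to rely on shared dictionary atoms. An actual solver would alternate updates of `D`, `A`, and `W`; that step is omitted here.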
Y. Han was partly supported by the NSFC (under Grants 61202166 and 61472276) and the Major Project of the National Social Science Fund of China (under Grant 14ZDB153).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, C., Han, Y. (2016). Describing Images with Ontology-Aware Dictionary Learning. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_29
DOI: https://doi.org/10.1007/978-3-319-27671-7_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27670-0
Online ISBN: 978-3-319-27671-7
eBook Packages: Computer Science, Computer Science (R0)