Abstract
Learning an appropriate distance metric is crucial for many computer vision and image classification applications. Neighborhood Components Analysis (NCA) is an effective distance metric learning method that maximizes the leave-one-out kNN score on the training data by considering the visual similarity between images. However, visual similarity alone cannot satisfactorily cope with the diversity and complexity of large collections of real images covering many concepts. To overcome this problem, integrating concrete semantic relations between images into the distance metric learning procedure can be a useful solution, since it can model image similarities more accurately and better reflect human perception in the classification system. In this paper, we propose Semantic NCA (SNCA), a novel approach that integrates semantic similarity into NCA, so that neighborhood relations between images in the training dataset are measured by both their visual characteristics and their concept relations. We evaluate several semantic similarity measures based on the WordNet tree. Experimental results show that the proposed approach improves performance compared with traditional distance metric learning methods.
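To make the idea concrete, the following is a minimal sketch and not the formulation used in the paper, which the abstract does not spell out. It computes the standard NCA objective (the expected leave-one-out score under stochastic neighbor assignment, following Goldberger et al.) and then an assumed semantic variant in which the hard same-class indicator is replaced by a class-to-class similarity matrix `S`. The function names, the matrix `S`, and the way `S` enters the objective are all illustrative assumptions; in practice `S` might be filled with WordNet-based similarities scaled to [0, 1].

```python
import numpy as np

def nca_soft_neighbors(X, A):
    """Stochastic neighbor probabilities p_ij under the linear map A (p_ii = 0)."""
    Z = X @ A.T                                      # projected points, shape (n, d_out)
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                     # a point never selects itself
    d2 = d2 - d2.min(axis=1, keepdims=True)          # shift each row for numerical stability
    W = np.exp(-d2)
    return W / W.sum(axis=1, keepdims=True)

def nca_score(X, y, A):
    """Standard NCA objective: expected number of correctly classified points."""
    P = nca_soft_neighbors(X, A)
    same_class = (y[:, None] == y[None, :])
    return float(np.sum(P * same_class))

def semantic_nca_score(X, y, A, S):
    """Assumed semantic variant: neighbors are credited by class similarity S[c_i, c_j].

    S is a hypothetical (n_classes, n_classes) matrix with values in [0, 1]
    and ones on the diagonal; it is not taken from the paper.
    """
    P = nca_soft_neighbors(X, A)
    return float(np.sum(P * S[np.ix_(y, y)]))
```

With `S` equal to the identity matrix, `semantic_nca_score` reduces to `nca_score`, so the sketch treats plain NCA as the special case in which distinct concepts share no semantic relation. Either score can be maximized over `A` with a gradient-based optimizer.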
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, F., Jiang, S., Herranz, L., Huang, Q. (2012). Improving Image Distance Metric Learning by Embedding Semantic Relations. In: Lin, W., et al. Advances in Multimedia Information Processing – PCM 2012. PCM 2012. Lecture Notes in Computer Science, vol 7674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34778-8_39
DOI: https://doi.org/10.1007/978-3-642-34778-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34777-1
Online ISBN: 978-3-642-34778-8
eBook Packages: Computer Science, Computer Science (R0)