International Journal of Computer Vision, Volume 121, Issue 1, pp 65–94

Learning a Distance Metric from Relative Comparisons between Quadruplets of Images

  • Marc T. Law
  • Nicolas Thome
  • Matthieu Cord

Abstract

This paper addresses the problem of learning a distance metric from meaningful and discriminative distance constraints in contexts where rich information about the relations between data is available. Classic metric learning approaches focus on constraints that involve pairs or triplets of images. We propose a general Mahalanobis-like distance metric learning framework that exploits distance constraints over up to four different images. We show how integrating such constraints can lead to unsupervised or semi-supervised learning tasks in some applications. We also show that this type of constraint improves recognition performance in rich contexts such as relative attributes, class taxonomies, and temporal webpage analysis.
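
To make the idea of a distance constraint over four images concrete, the following is a minimal sketch and not the authors' implementation. It assumes each quadruplet (i, j, k, l) asks that the pair (k, l) be at least a margin farther apart than the pair (i, j), that violations are penalised with a hinge loss, and that the metric is parameterised by a linear map L so that M = LᵀL. The function names, the margin value, and the toy data are all hypothetical.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
Quadruplet = Tuple[int, int, int, int]


def mahalanobis_sq(L: Matrix, x: Vector, y: Vector) -> float:
    """Squared Mahalanobis-like distance ||L (x - y)||^2, i.e. M = L^T L."""
    diff = [a - b for a, b in zip(x, y)]
    projected = [sum(w * d for w, d in zip(row, diff)) for row in L]
    return sum(p * p for p in projected)


def quadruplet_hinge_loss(
    L: Matrix,
    X: Sequence[Vector],
    quadruplets: Sequence[Quadruplet],
    margin: float = 1.0,
) -> float:
    """Average hinge penalty over quadruplets (i, j, k, l).

    Assumed constraint form (an illustration, not taken from the paper):
        d(x_k, x_l) >= d(x_i, x_j) + margin
    A quadruplet that satisfies the constraint contributes zero.
    """
    total = 0.0
    for i, j, k, l in quadruplets:
        near = mahalanobis_sq(L, X[i], X[j])
        far = mahalanobis_sq(L, X[k], X[l])
        total += max(0.0, margin + near - far)
    return total / len(quadruplets)


# Toy data: four 2-D "image descriptors" and an identity metric.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [4.0, 3.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

satisfied = quadruplet_hinge_loss(identity, X, [(0, 1, 2, 3)])
violated = quadruplet_hinge_loss(identity, X, [(2, 3, 0, 1)])
# A triplet is the special case where two of the four indices coincide.
triplet = quadruplet_hinge_loss(identity, X, [(0, 1, 0, 2)])
```

Under these assumptions, pairwise and triplet constraints fall out as quadruplets with repeated indices, which is one way to read "up to four different images". Learning the metric would then amount to minimising this loss over L, typically with a regulariser.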

Keywords

Metric learning · Relative attributes · Web mining · Change detection

Notes

Acknowledgments

This work was partially supported by the SCAPE Project, co-funded by the European Union under FP7 ICT 2009.4.1 (Grant Agreement no. 270137).

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6, Paris, France
