Skip to main content

The Earth Mover's Distance as a Metric for Image Retrieval

Abstract

We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.

This is a preview of subscription content, access via your institution.

References

  1. Ahuja, R.K., Magnanti, T.L., and Orlin, J.B. 1993. Network Flows. Prentice Hall: Englewood Cliffs, NJ.

    Google Scholar 

  2. Belongie, S., Carson, C., Greenspan, H., and Malik, J. 1998. Colorand texture-based image segmentation using EM and its application to content-based image retrieval. In IEEE International Conference on Computer Vision, Bombay, India. pp. 675–682.

  3. Bentley, J.L. 1975. Multidimensional binary search trees used for associative searching. Communications of the ACM, 18:509–517.

    Google Scholar 

  4. Bigün, J. and du Buf, J.M. 1994. N-folded symmetries by complex moments in Gabor space and their application to unsupervised texture segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(1):80–87.

    Google Scholar 

  5. Bovik, A.C., Clark, M., and Geisler, W.S. 1990. Multichannel texture analysis using localized spatial filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(12):55–73.

    Google Scholar 

  6. Bozkaya, T. and Ozsoyoglu, M. 1997. Distance-based indexing for high-dimensional metric spaces. SIGMOD Record (ACM Special Interest Group on Management of Data), 26(2):357–368.

    Google Scholar 

  7. Brodatz, P. 1966. Textures: A Photographic Album for Artists and Designers: Dover: New York, NY.

    Google Scholar 

  8. Clarkson, K.L. 1997. Nearest neighbor queries in metric spaces. In ACM Symposium on the Theory of Computing, pp. 609–617.

  9. Cover, T.M. and Thomas, J.A. 1991. Elements of Information Theory. Wiley Series in Telecommunications. John Wiley & Sons: New York, NY, USA.

    Google Scholar 

  10. Das, M., Riseman, E.M., and Draper, B.A. 1997. FOCUS: Searching for multi-colored objects in a diverse image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 756–761.

  11. Daugman, J.D. 1998. Complete discrete 2-d Gabor transforms by neural networks for image analysis and compression. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36:1169–1179.

    Google Scholar 

  12. Duda, R.O. and Hart, P.E. 1973. Pattern Classification and Scene Analysis. Wiley: New York.

    Google Scholar 

  13. Farrokhnia, F. and Jain, A.K. 1991. A multi-channel filtering approach to texture segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 364–370.

  14. Field, D.J. 1987. Relations between the statistics of natural images and the response properties of cortical cells. Journal of the Optical Society of America A, 4(12):2379–2394.

    Google Scholar 

  15. Gabor, D. 1946. Theory of communication. The Journal of the Institute of Electrical Engineers, Part III, 93(21):429–457.

    Google Scholar 

  16. Hafner, J., Sawhney, H.S., Equitz, W., Flickner, M., and Niblack, W. 1995. Efficient color histogram indexing for quadratic form distance functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7):729–735.

    Google Scholar 

  17. Hillier, F.S. and Lieberman, G.J. 1990. Introduction to Mathematical Programming. McGraw-Hill, New York, NY.

    Google Scholar 

  18. Hitchcock, F.L. 1941. The distribution of a product from several sources to numerous localities. J. Math. Phys., 20:224–230.

    Google Scholar 

  19. Karmarkar, N. 1984. A new polynomial-time algorithm for linear programming. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, Washington, DC. pp. 302–311.

  20. Klee, V. and Minty, G. 1972. How good is the simplex algorithm. In Inequalities, Vol. III, O. Shisha (Ed.). Academic Press: New York, NY, pp. 159–175.

    Google Scholar 

  21. Kullback, S. 1968. Information Theory and Statistics Dover: New York, NY.

    Google Scholar 

  22. Liu, F. and Picard, R.W. 1996. Periodicity, directionality, and randomness: Wold features for image modeling and retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(7):722–733.

    Google Scholar 

  23. Manjunath, B.S. and Ma, W.Y. 1996. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):837–842.

    Google Scholar 

  24. Nasrabad, N.M. and King, R.A. 1988. Image coding using vector quantization: A review. IEEE Transactions on Communication, 36(8):957–971.

    Google Scholar 

  25. Niblack, W., Barber, R., Equitz, W., Flickner, M.D., Glasman, E.H., Petkovic, D., Yanker, P., Faloutsos, C., Taubin, G., and Heights, Y. 1993. Querying images by content, using color, texture, and shape. In SPIE Conference on Storage and Retrieval for Image and Video Databases, Vol. 1908, pp. 173–187.

    Google Scholar 

  26. Peleg, S., Werman, M., and Rom, H. 1989. A unified approach to the change of resolution: Space and gray-level. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11:739–742.

    Google Scholar 

  27. Poynton, C. 1996. A Technical Introduction to Digital Video. John Wiley and Sons, New York, NY.

    Google Scholar 

  28. Puzicha, J., Hofmann, T., and Buhmann, J. 1997. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 267–272.

  29. Rachev, S.T. 1984. The Monge-Kantorovich mass transference problem and its stochastic applications. Theory of Probability and its Applications, XXIX(4):647–676.

    Google Scholar 

  30. Russell, E.J. 1969. Extension of Dantzig's algorithm to finding an initial near-optimal basis for the transportation problem. Operations Research, 17:187–191.

    Google Scholar 

  31. Shen, H.C. and Wong, A.K.C. 1983. Generalized texture representation and metric. Computer, Vision, Graphics, and Image Processing, 23:187–206.

    Google Scholar 

  32. Smith, J.R. 1997. Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression. Ph.D. Thesis, Columbia University.

  33. Stolfi, J. 1994. Personal communication.

  34. Stricker, M. and Orengo, M. 1995. Similarity of color images. In SPIE Conference on Storage and Retrieval for Image and Video Databases III, Vol. 2420, pp. 381–392.

    Google Scholar 

  35. Swain, M.J. and Ballard, D.H. 1991. Color indexing. International Journal of Computer Vision, 7(1):11–32.

    Google Scholar 

  36. Tversky, A. 1977. Features of similarity. Psychological Review, 84(4):327–352.

    Google Scholar 

  37. Werman, M., Peleg, S., and Rosenfeld, A. 1985. A distance metric for multi-dimensional histograms. Computer, Vision, Graphics, and Image Processing, 32:328–336.

    Google Scholar 

  38. Wyszecki, G. and Stiles, W.S. 1982. Color Science: Concepts and Methods, Quantitative Data and Formulae. John Wiley and Sons: New York, NY.

    Google Scholar 

  39. Zikan, K. 1990. The Theory and Applications of Algebraic Metric Spaces. Ph.D. Thesis, Stanford University.

Download references

Author information

Affiliations

Authors

Rights and permissions

Reprints and Permissions

About this article

Cite this article

Rubner, Y., Tomasi, C. & Guibas, L.J. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision 40, 99–121 (2000). https://doi.org/10.1023/A:1026543900054

Download citation

  • image retrieval
  • perceptual metrics
  • color
  • texture
  • Earth Mover's Distance