Combination of Distances and Image Features for Clustering Image Data Bases

  • Sarah Frost
  • Daniel Baier
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Millions of pictures are published online every day, but analyzing them automatically for marketing purposes is difficult. This paper aims to show how methods from content-based image retrieval can be used to classify image data and make them usable for marketing applications. A number of different image features can be extracted from the images, and dissimilarities between images can then be calculated from these features with different kinds of distance measures (Manjunath et al. 2001). We focus especially on distances based on the mass transportation problem, such as the Earth Mover’s Distance (EMD) (Rubner et al., Int J Comput Vis 40(2):99–121, 2000), because they match human perception of dissimilarity. Furthermore, earlier studies show that they are robust to disturbances such as changes in resolution, contrast, or noise (Frost and Baier, Algorithms from and for nature and life. Studies in classification, data analysis, and knowledge organization, vol 45. Springer, Heidelberg, 2013). We compare some approximations of the EMD (e.g., Pele and Werman 2009) with an approximation algorithm that we developed ourselves. The aim is to find a combination of features and distances that makes it possible to cluster large image databases in a way that matches human perception.
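
As a rough illustration of the transportation idea behind the EMD, the following Python sketch computes the distance for the simplest case only: two one-dimensional histograms with the same bins, where the ground distance between bins i and j is assumed to be |i - j|. In that special case the EMD equals the summed absolute difference of the cumulative histograms. The function names `emd_1d` and `pairwise_emd` are hypothetical, and this is neither the approximation algorithm developed in the paper nor the method of Pele and Werman (2009).

```python
import numpy as np


def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms over the same bins.

    Assumption: the ground distance between bins i and j is |i - j|.
    Both histograms are normalized to unit mass before comparison.
    """
    a = np.asarray(hist_a, dtype=float)
    b = np.asarray(hist_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("histograms must be 1-D and have the same length")
    if a.sum() <= 0 or b.sum() <= 0:
        raise ValueError("histograms must have positive total mass")
    a = a / a.sum()
    b = b / b.sum()
    # The running surplus of mass that has to be carried to the next bin
    # is the cumulative sum of the bin-wise differences.
    return float(np.abs(np.cumsum(a - b)).sum())


def pairwise_emd(histograms):
    """Symmetric matrix of EMD values for a list of 1-D histograms."""
    n = len(histograms)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = emd_1d(histograms[i], histograms[j])
    return dist
```

A dissimilarity matrix built this way could be passed to any clustering method that accepts precomputed distances. Multidimensional color histograms or other ground distances require a general transportation solver or one of the approximations discussed in the paper.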


Color Histogram; Image Block; Rand Index; Adjusted Rand Index; Transportation Distance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. Buturovic, A. (2005). MPEG-7 color structure descriptor: For visual information retrieval project VizIR. Institute for Software Technology and Interactive Systems Technical University Vienna, Technical Report.
  2. Chatzichristofis, S., & Boutalis, Y. (2008a). CEDD: Color and edge directivity descriptor - A compact description for image indexing and retrieval. In A. Gasteratos, M. Vincze, & J. Tsotsos (Eds.), Computer vision systems (pp. 312–322). Heidelberg: Springer.
  3. Chatzichristofis, S., & Boutalis, Y. (2008b). FCTH: Fuzzy color and texture histogram - A low level feature for accurate image retrieval. In 9th International Workshop on Image Analysis for Multimedia Interactive Services (pp. 191–196).
  4. Deza, M., & Deza, E. (2009). Encyclopedia of distances. Berlin: Springer.
  5. Frost, S., & Baier, D. (2013). Comparing earth mover’s distance and its approximations for clustering images. In B. Lausen, D. van den Poel, & A. Ultsch (Eds.), Algorithms from and for nature and life. Studies in classification, data analysis, and knowledge organization (Vol. 45). Heidelberg: Springer.
  6. Geman, D., Geman, S., Graffigne, C., & Dong, P. (1990). Boundary detection by constrained optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 609–628.
  7. Hafner, J., Sawhney, H., Equitz, W., Flickner, M., & Niblack, W. (1995). Efficient color histogram indexing for quadratic form distance functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7), 729–736.
  8. Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2, 193–218.
  9. Manjunath, B., Ohm, J.-R., Vasudevan, V., & Yamada, A. (2001). Color and texture descriptors. IEEE Transactions on Circuits and Systems for Video Technology, 11(6), 703–715.
  10. Manjunath, B., & Sikora, T. (2002). Overview over visual descriptors. In B. S. Manjunath, P. Salembier, & T. Sikora (Eds.), Introduction to MPEG-7 (pp. 179–185). Chichester: Wiley.
  11. Minkowski, H. (1910). Geometrie der Zahlen (2nd ed.). Leipzig: Teubner.
  12. Naundorf, R., Baier, D., & Schmitt, I. (2012). Statistical software for clustering images. Institute of Computer Science, Brandenburg University of Technology Cottbus, Technical Report.
  13. Niblack, W., Barber, R., Equitz, W., Flickner, M., Glasman, E., Petkovic, D., et al. (1993). The QBIC project: Querying images by content using color, texture, and shape. In SPIE Storage and Retrieval for Image and Video Databases, 1908 (pp. 173–187).
  14. Ojala, T., Pietikäinen, M., & Harwood, D. (1996). A comparative study of texture measures with classification based on feature distributions. Pattern Recognition, 29(1), 51–59.
  15. Park, D., Jeon, Y., & Won, C. (2000). Efficient use of local edge histogram descriptor. In MULTIMEDIA ’00 Proceedings of the 2000 ACM Workshops on Multimedia (pp. 51–54).
  16. Pele, O., & Werman, M. (2009). Fast and robust earth mover’s distances. In IEEE 12th International Conference on Computer Vision, 12(1), 460–467.
  17. Rubner, Y., Guibas, L., & Tomasi, C. (1997). The earth mover’s distance, multi-dimensional scaling, and color-based image retrieval. In Proceedings of the ARPA Image Understanding Workshop (pp. 661–668).
  18. Rubner, Y., Tomasi, C., & Guibas, L. J. (2000). The earth mover’s distance as a metric for image retrieval. International Journal of Computer Vision, 40(2), 99–121.
  19. Swain, M., & Ballard, D. (1991). Color indexing. International Journal of Computer Vision, 7(1), 11–32.
  20. Tamura, H., Mori, S., & Yamawaki, T. (1978). Textural features corresponding to visual perception. IEEE Transactions on Systems, Man, and Cybernetics, 8(6), 460–473.
  21. Werman, M., Peleg, S., & Rosenfeld, A. (1985). A distance metric for multidimensional histograms. Computer Vision, Graphics, and Image Processing, 32, 328–336.
  22. Wyszecki, G., & Stiles, W. (2000). Color science: Concepts and methods, quantitative data and formulae (2nd ed.). New York: Wiley.
  23. Zhang, D., & Lu, G. (2003). Evaluation of similarity measurements for image retrieval. In IEEE International Conference on Neural Networks and Signal Processing (pp. 928–931).
  24. Zhang, D., Wong, A., Indrawan, M., & Lu, G. (2000). Content-based image retrieval using Gabor texture features. In First IEEE Pacific-Rim Conference on Multimedia (pp. 392–395).

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Institute of Business Administration and Economics, Brandenburg University of Technology Cottbus, Cottbus, Germany
