Image Clustering Using Multimodal Keywords

  • Rajeev Agrawal
  • William Grosky
  • Farshad Fotouhi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4306)


Extending our previous work on visual keywords, we build template-based visual keywords from MPEG-7 color descriptors. MPEG-7, also called the Multimedia Content Description Interface, has been a standard for many years. Its color descriptors characterize perceptual color similarity, can be extracted with relatively low-complexity operations, and are scalable and interoperable. We then demonstrate the power of these visual keywords for image clustering when they are used in tandem with textual keyword annotations in the context of latent semantic analysis, a popular technique from classical information retrieval that has been used to reveal the underlying semantic structure of document collections.
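
As a rough illustration of the idea summarized above, the sketch below stacks a visual-keyword count matrix on top of a textual-keyword count matrix and applies latent semantic analysis through a truncated singular value decomposition. It is not the authors' pipeline: the counts are invented toy data, the function `lsa_embed` and the helper `cosine` are hypothetical names, no MPEG-7 descriptor extraction is performed, and NumPy is assumed to be available.

```python
import numpy as np

def lsa_embed(term_doc, k):
    """Map each column (image) of a keyword-by-image matrix to k latent dimensions."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; each row of the result is one image.
    return (np.diag(s[:k]) @ Vt[:k]).T

def cosine(a, b):
    """Cosine similarity between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy counts (assumed, not from the paper): rows are keywords, columns are images.
visual = np.array([
    [2, 3, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 3, 2],
], dtype=float)
textual = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
], dtype=float)

# Multimodal keyword-by-image matrix: visual rows stacked on textual rows.
multimodal = np.vstack([visual, textual])
images = lsa_embed(multimodal, k=2)

sim_same = cosine(images[0], images[1])  # images sharing most keywords
sim_diff = cosine(images[0], images[2])  # images sharing almost none
print(sim_same, sim_diff)
```

In this toy setting the first two images land close together in the latent space, so any standard clustering method applied to the rows of `images` would tend to group them; the choice of `k` and of the clustering method is left open here.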


MPEG-7, visual keywords, textual keywords, latent semantic analysis, singular value decomposition, adjusted Rand index




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Rajeev Agrawal (1, 2)
  • William Grosky (3)
  • Farshad Fotouhi (2)
  1. Kettering University, Flint, USA
  2. Wayne State University, Detroit, USA
  3. The University of Michigan – Dearborn, Dearborn, USA
