Multimedia Systems, Volume 23, Issue 1, pp 41–52

Graph-based clustering and ranking for diversified image search

  • Yan Yan
  • Gaowen Liu
  • Sen Wang
  • Jian Zhang
  • Kai Zheng
Special Issue Paper


In this paper, we consider the problem of clustering and re-ranking web image search results to improve diversity at high ranks. We propose a novel ranking framework, the cluster-constrained conditional Markov random walk (CCCMRW), which has two key steps: first, cluster the images into topics; second, perform a Markov random walk on an image graph, conditioned on constraints derived from the image cluster information. To cluster the retrieved web images, we propose a novel graph clustering model. We exploit the surrounding text to mine correlations between words and images, and use these correlations to improve the clustering results. Two kinds of correlations are mainly considered: word-to-image and word-to-word. As a standard text processing technique, the tf-idf method cannot measure word-to-image correlation directly. We therefore combine tf-idf with a novel word feature, visibility, to infer the word-to-image correlation. Using the latent Dirichlet allocation model, we define a topic relevance function to compute the weights of word-to-word correlations. Taking word-to-image correlations as heterogeneous links and word-to-word correlations as homogeneous links, we apply graph clustering algorithms, such as complex graph clustering and spectral co-clustering, to cluster the images into topics. To perform CCCMRW, a two-layer image graph is constructed by adding image cluster nodes as an upper layer on top of a base image graph. Conditioned on the cluster information from the upper layer, the Markov random walk is biased to move across different image clusters, so that images of different topics receive high rank scores and the ranking gains diversity. We report encouraging clustering and re-ranking results on Google image search results.
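The re-ranking idea can be illustrated with a small sketch. It is not the CCCMRW formulation from the paper, which this abstract does not spell out. It is one simple reading of "a random walk inclined to cross clusters": edges between images in different clusters are multiplied by a hypothetical boost `beta`, and scores come from a PageRank-style power iteration. The function name, `beta`, `damping`, and the toy similarity matrix are all assumptions made for illustration.

```python
def cluster_biased_walk(sim, clusters, beta=2.0, damping=0.85, iters=100):
    """Score images with a random walk whose cross-cluster steps are boosted.

    sim      -- symmetric list-of-lists of non-negative image similarities
    clusters -- clusters[i] is the topic label of image i
    beta     -- hypothetical boost (> 1) applied to edges between clusters
    """
    n = len(sim)
    # Re-weight each edge: boost it when it links two different clusters.
    weights = [
        [0.0 if i == j
         else sim[i][j] * (beta if clusters[i] != clusters[j] else 1.0)
         for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise into transition probabilities; isolated nodes jump uniformly.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    # Power iteration with uniform teleportation (PageRank-style damping).
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Toy data: images 0-2 share one topic, image 3 is the only one from another.
sim = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
clusters = [0, 0, 0, 1]
scores = cluster_biased_walk(sim, clusters, beta=3.0)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

In this toy setting, raising `beta` moves probability mass toward image 3, the lone member of the second topic, which is the qualitative effect the abstract describes. A faithful implementation would instead use the two-layer graph with explicit cluster nodes.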


Keywords: Web image clustering, Ranking, Diversity, Visibility, Graph model



This work is partially supported by the National Natural Science Foundation of China (No. 61303143) and the Scientific Research Fund of Zhejiang Provincial Education Department (No. Y201326609).



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Yan Yan (1)
  • Gaowen Liu (2)
  • Sen Wang (1)
  • Jian Zhang (3), corresponding author
  • Kai Zheng (1)
  1. School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
  2. Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
  3. School of Science and Technology, Zhejiang International Studies University, Zhejiang, China
