Personalized Web Image Organization

  • Lei Meng
  • Ah-Hwee Tan
  • Donald C. Wunsch II
Part of the Advanced Information and Knowledge Processing book series (AI&KP)


Because of the semantic gap, i.e., the visual content of an image may not represent its semantics well, existing approaches to web image organization usually recast the task as clustering the surrounding text. However, the surrounding text is usually short and its words typically appear only once, so existing text clustering algorithms can hardly exploit statistical information for image representation, and learning from noisy tags may degrade their performance and raise their computational cost. This chapter applies the Probabilistic ART with user preference architecture, as introduced in Sects. 3.5 and 3.4, to personalized web image organization. The fused algorithm, named Probabilistic Fusion ART (PF-ART), groups images of similar semantics together and simultaneously mines the key tags/topics of individual clusters. Moreover, it performs semi-supervised learning on user-provided taggings of images, giving users direct control over the generated clusters. An agglomerative merging strategy then organizes the clusters into a hierarchy with a multi-branch tree structure, rather than the binary tree generated by traditional hierarchical clustering algorithms. The entire two-step algorithm, called Personalized Hierarchical Theme-based Clustering (PHTC), targets tag-based web image organization. Two large-scale real-world web image collections, the NUS-WIDE and Flickr datasets, are used to evaluate PHTC and to compare it with existing algorithms in terms of clustering performance and time cost. The content of this chapter is summarized and extended from the prior study [17] (©2012 IEEE. Reprinted, with permission, from [17]).
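To make the two-step idea concrete, the following is a minimal sketch under stated assumptions, not the PF-ART or PHTC algorithm itself. It assumes each image is a set of tags, uses a simple tag-overlap match with a vigilance-style threshold to grow clusters online, treats within-cluster tag counts as a rough indicator of key tags, and then merges clusters that share key tags into multi-branch themes. The function names (`match`, `cluster_images`, `key_tags`, `merge_into_themes`), the Jaccard merging criterion, and all threshold values are hypothetical choices made for illustration.

```python
from collections import Counter


def match(tags, cluster):
    """Share of an image's tags that the cluster has already seen (assumed match rule)."""
    if not tags:
        return 0.0
    return sum(1 for t in tags if t in cluster["counts"]) / len(tags)


def cluster_images(images, vigilance=0.5):
    """Step 1 (sketch): online clustering of tag sets with a vigilance-style test."""
    clusters = []
    for idx, tags in enumerate(images):
        best, best_score = None, 0.0
        for c in clusters:
            score = match(tags, c)
            if score > best_score:
                best, best_score = c, score
        if best is not None and best_score >= vigilance:
            # The image joins the best-matching cluster; tag counts are updated.
            best["members"].append(idx)
            best["counts"].update(tags)
        else:
            # No cluster matches well enough, so a new one is created.
            clusters.append({"members": [idx], "counts": Counter(tags)})
    return clusters


def key_tags(cluster, k=2):
    """The k most frequent tags of a cluster, ties broken alphabetically."""
    ranked = sorted(cluster["counts"].items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_into_themes(clusters, threshold=0.5):
    """Step 2 (sketch): group clusters with similar key tags under one theme.

    A theme may hold any number of child clusters, so the result is a
    multi-branch grouping rather than a binary tree.
    """
    themes = []
    for c in clusters:
        keys = set(key_tags(c))
        for theme in themes:
            if jaccard(keys, theme["keys"]) >= threshold:
                theme["children"].append(c)
                theme["keys"] |= keys
                break
        else:
            themes.append({"keys": set(keys), "children": [c]})
    return themes


if __name__ == "__main__":
    images = [
        {"beach", "sea", "sunset"},
        {"beach", "sea"},
        {"dog", "pet"},
        {"dog", "puppy", "pet"},
        {"sea", "sunset"},
    ]
    for theme in merge_into_themes(cluster_images(images, vigilance=0.7)):
        print(sorted(theme["keys"]), [c["members"] for c in theme["children"]])
```

In this sketch a lower vigilance value yields fewer, coarser clusters, while a higher value splits the images more finely and leaves more work to the merging step. User-provided taggings, which the chapter uses for semi-supervised learning, are not modeled here.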


References

  1. Aichholzer O, Aurenhammer F (1996) Classifying hyperplanes in hypercubes. SIAM J Discret Math 9:225–232
  2. Cai D, He X, Li Z, Ma W, Wen J (2004) Hierarchical clustering of www image search results using visual, textual and link information. In: Proceedings ACM multimedia, pp 952–959
  3. Carpenter GA, Grossberg S, Rosen DB (1991) Fuzzy ART: fast stable learning and categorization of analog patterns by an adaptive resonance system. Neural Netw 4:759–771
  4. Chen L, Xu D, Tsang IW, Luo J (2012) Tag-based image retrieval improved by augmented features and group-based refinement. IEEE Trans Multimed (T-MM) 14:1057–1067
  5. Chen Y, Dong M, Wan W (2007) Image co-clustering with multi-modality features and user feedbacks. In: MM, pp 689–692
  6. Chen Y, Rege M, Dong M, Hua J (2007) Incorporating user provided constraints into document clustering. In: ICDM, pp 103–112
  7. Chua T, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) NUS-WIDE: a real-world web image database from national university of Singapore. In: CIVR, pp 1–9
  8. Cilibrasi R, Vitanyi PMB (2007) The google similarity distance. TKDE 19(3):370–383
  9. Ding H, Liu J, Lu H (2008) Hierarchical clustering-based navigation of image search results. In: Proceedings of ACM Multimedia, pp 741–744
  10. Gower J, Ross G (1969) Minimum spanning trees and single linkage clustering analysis. J R Stat Soc Ser C 595–616
  11. He J, Tan AH, Tan CL, Sung SY (2003) On quantitative evaluation of clustering systems. In: Clustering and Information Retrieval, Kluwer Academic Publishers, pp 105–133
  12. Hsu C, Caverlee J, Khabiri E (2011) Hierarchical comments-based clustering. In: Proceedings ACM SAC, pp 1130–1137
  13. Hu X, Sun N, Zhang C, Chua TS (2009) Exploiting internal and external semantics for the clustering of short texts using world knowledge. In: Proceedings of ACM conference on information and knowledge management, pp 919–928
  14. Jing F, Wang C, Yao Y, Zhang L, Ma W (2006) Igroup: web image search results clustering. In: Proceedings of ACM Multimedia, pp 377–384
  15. Li L, Liang Y (2010) A hierarchical fuzzy clustering algorithm. In: Proceedings ICCASM, pp 248–255
  16. Liu D, Hua X, Yang L, Wang M, Zhang H (2009) Tag ranking. In: Proceedings of international conference on World Wide Web, pp 351–360
  17. Meng L, Tan AH (2012) Semi-supervised hierarchical clustering for personalized web image organization. In: Proceedings of international joint conference on neural networks (IJCNN), pp 1–8
  18. Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity: measuring the relatedness of concepts. Demonstration papers at HLT-NAACL
  19. Rege M, Dong M, Fotouhi F (2006) Co-clustering documents and words using bipartite isoperimetric graph partitioning. In: Proceedings of international conference on data mining, pp 532–541
  20. Schütze H, Silverstein C (1997) Projections for efficient document clustering. In: Proceedings SIGIR, pp 74–81
  21. Shi X, Fan W, Yu PS (2010) Efficient semi-supervised spectral co-clustering with constraints. In: ICDM, pp 532–541
  22. Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of SIGIR conference on research and development in information retrieval, pp 268–273

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. NTU-UBC Research Center of Excellence in Active Living for the Elderly (LILY), Nanyang Technological University, Singapore, Singapore
  2. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  3. Applied Computational Intelligence Laboratory, Missouri University of Science and Technology, Rolla, USA
