Bidirectional-isomorphic manifold learning at image semantic understanding & representation

Multimedia Tools and Applications

Abstract

Using relevant textual information to improve visual content understanding and representation is an effective way to understand web image content in depth. However, image descriptions are usually imprecise at the semantic level because of noisy and redundant information in both the textual aspect (such as surrounding text in HTML pages) and the visual aspect (such as intra-class diversity). This paper approaches the problem through association analysis of image content and presents a Bidirectional-Isomorphic Manifold Learning strategy that optimizes both the visual feature space and the textual space, in order to achieve a more accurate comprehension of image semantics and relationships. To achieve this optimization between two different models, Bidirectional-Isomorphic Manifold Learning uses a novel algorithm, called the reversed manifold mapping, that unifies the adjustments in both models into a single topological structure. We also demonstrate its correctness and convergence from a mathematical perspective. The method is applied to image annotation and keyword correlation analysis. Two groups of experiments are conducted: the first is carried out on the Corel 5000 image database to validate the method's effectiveness by comparison with the state-of-the-art Generalized Manifold Ranking Based Image Retrieval and SVM, while the second is carried out on a web-downloaded Flickr dataset of over 6,000 images to test the method's effectiveness in a real-world application. The promising results show that our model attains a significant improvement over state-of-the-art algorithms.
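As background on the manifold ranking baseline named in the abstract, the following is a minimal sketch of the standard closed-form manifold ranking score, f = (I - alpha S)^(-1) y, computed on a Gaussian affinity graph. It is not the Bidirectional-Isomorphic Manifold Learning algorithm itself, whose details are not given in this abstract. The function name `manifold_ranking` and the parameters `sigma` and `alpha` are illustrative assumptions, and the feature vectors are toy data.

```python
import numpy as np

def manifold_ranking(features, query_mask, sigma=1.0, alpha=0.9):
    """Rank items by relevance to the query items on an affinity graph.

    Illustrative sketch only: names and default parameters are assumptions,
    not taken from the paper.
    """
    X = np.asarray(features, dtype=float)
    y = np.asarray(query_mask, dtype=float)  # 1 for query items, 0 otherwise

    # Pairwise squared Euclidean distances between all feature vectors.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

    # Gaussian affinity matrix with no self-loops.
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    # Assumes every item has a nonzero total affinity to the others.
    inv_sqrt_deg = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

    # Closed-form solution of the iteration f <- alpha * S f + y.
    return np.linalg.solve(np.eye(len(X)) - alpha * S, y)

# Toy data: two well-separated clusters, with the query in the first one.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
scores = manifold_ranking(points, [1, 0, 0, 0, 0, 0])
ranking = np.argsort(-scores)  # indices sorted from most to least relevant
```

In this toy setup the items that share a cluster with the query receive higher scores than the items in the distant cluster, which is the behaviour a manifold-based ranking is meant to capture.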

Acknowledgement

The work was supported in part by the National Science Foundation of China No. 61071180, and Key Program Grant of National Science Foundation of China No. 61133003.

Author information

Correspondence to Hongxun Yao.

About this article

Cite this article

Liu, X., Yao, H., Ji, R. et al. Bidirectional-isomorphic manifold learning at image semantic understanding & representation. Multimed Tools Appl 64, 53–76 (2013). https://doi.org/10.1007/s11042-011-0947-2

  • DOI: https://doi.org/10.1007/s11042-011-0947-2
