Learning to detect concepts with Approximate Laplacian Eigenmaps in large-scale and online settings

  • Eleni Mantziou
  • Symeon Papadopoulos
  • Yiannis Kompatsiaris
Regular Paper

Abstract

We present a versatile and effective manifold learning approach to concept detection in large-scale and online settings. We demonstrate that Approximate Laplacian Eigenmaps, a latent representation of the manifold underlying a set of images, offer a compact yet effective feature representation for concept detection. We set out the theoretical principles of the approach and present an extension that makes it applicable in online settings. We evaluate the approach on several well-known datasets and on two new datasets from the social media domain, and demonstrate that it achieves detection accuracy equal to or slightly better than that of supervised methods while offering a substantial speedup: for instance, it trains ten concept detectors on 1.5M images in just 3 min on a commodity server. We also explore several factors that affect the detection accuracy of the proposed approach, including the size of the training set, the role of unlabelled samples in semi-supervised learning settings, and the performance of the approach across different concepts.

Keywords

  • Concept detection
  • Semi-supervised learning
  • Laplacian Eigenmaps
  • Online learning

Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  • Eleni Mantziou (1)
  • Symeon Papadopoulos (1)
  • Yiannis Kompatsiaris (1)
  1. Information Technologies Institute (ITI), Centre for Research and Technology Hellas (CERTH), Thessaloníki, Greece
