International Journal of Computer Vision, Volume 113, Issue 2, pp 113–127

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

  • Yi Yang
  • Zhigang Ma
  • Feiping Nie
  • Xiaojun Chang
  • Alexander G. Hauptmann


Active learning relieves the tedious work of manual annotation and plays an important role in many applications of visual concept recognition. In typical active learning scenarios, the seed set contains only a small amount of labelled data. However, most existing active learning algorithms exploit only the labelled data and therefore often suffer from over-fitting. Moreover, while much progress has been made in binary-class active learning, little research attention has been paid to multi-class active learning. In this paper, we propose a semi-supervised batch-mode multi-class active learning algorithm for visual concept recognition. Our algorithm exploits the whole active pool to evaluate the uncertainty of the data. Because uncertain data tend to be similar to each other, we make the selected data as diverse as possible by explicitly imposing a diversity constraint on the objective function. As a multi-class active learning algorithm, it is able to exploit uncertainty across multiple classes. An efficient algorithm is used to optimize the objective function. Extensive experiments on action recognition, object classification, scene recognition, and event detection demonstrate its advantages.
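To make the idea of combining uncertainty with diversity concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It swaps in a simple greedy heuristic for the paper's constrained objective and its optimizer, and it does not model the semi-supervised component. The function names, the entropy-based uncertainty measure, the cosine similarity, and the `trade_off` parameter are all assumptions introduced here for illustration.

```python
import numpy as np


def entropy_uncertainty(probs):
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def select_batch(probs, features, batch_size, trade_off=1.0):
    """Greedily pick uncertain pool samples that are dissimilar to earlier picks.

    Hypothetical helper: `trade_off` weights the diversity penalty and is an
    assumption of this sketch, not a parameter taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    uncertainty = entropy_uncertainty(probs)
    n = len(uncertainty)

    # Cosine similarity between all pairs of pool samples.
    norms = np.maximum(np.linalg.norm(features, axis=1, keepdims=True), 1e-12)
    unit = features / norms
    similarity = unit @ unit.T

    selected = []
    available = np.ones(n, dtype=bool)
    # Similarity to the closest already-selected sample; starting at zero
    # means negative similarities are never rewarded.
    max_sim = np.zeros(n)
    for _ in range(min(batch_size, n)):
        score = uncertainty - trade_off * max_sim
        score = np.where(available, score, -np.inf)
        pick = int(np.argmax(score))
        selected.append(pick)
        available[pick] = False
        max_sim = np.maximum(max_sim, similarity[pick])
    return selected


# Toy pool: samples 0 and 1 are identical and maximally uncertain,
# sample 2 is slightly less uncertain but lies in a different direction.
pool_probs = [[1 / 3, 1 / 3, 1 / 3],
              [1 / 3, 1 / 3, 1 / 3],
              [0.4, 0.3, 0.3],
              [0.98, 0.01, 0.01]]
pool_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

diverse_batch = select_batch(pool_probs, pool_features, batch_size=2, trade_off=1.0)
uncertain_only_batch = select_batch(pool_probs, pool_features, batch_size=2, trade_off=0.0)
```

In this toy pool, pure uncertainty sampling (`trade_off=0.0`) selects the two identical samples, whereas the diversity penalty replaces the duplicate with the dissimilar sample. This is the redundancy that a diversity constraint is meant to avoid.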


  • Active learning
  • Uncertainty sampling
  • Diversity maximization



This paper was partially supported by the US Department of Defense, U.S. Army Research Office (W911NF-13-1-0277), by the ARC DECRA project DE130101311, and by the Tianjin Key Laboratory of Cognitive Computing and Application.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Yi Yang
    • 1
  • Zhigang Ma
    • 2
  • Feiping Nie
    • 3
  • Xiaojun Chang
    • 1
  • Alexander G. Hauptmann
    • 2
  1. Centre for Quantum Computation and Intelligent Systems, University of Technology Sydney, Sydney, Australia
  2. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
  3. The Center for OPTical IMagery Analysis and Learning, Northwestern Polytechnical University, Xi’an, China
