Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization


Active learning relieves the tedious work of manual annotation and plays an important role in many applications of visual concept recognition. In typical active learning scenarios, the seed set usually contains only a few labelled examples. However, most existing active learning algorithms exploit only the labelled data and therefore often over-fit because so few labelled examples are available. Moreover, while much progress has been made in binary-class active learning, multi-class active learning has received little research attention. In this paper, we propose a semi-supervised, batch-mode, multi-class active learning algorithm for visual concept recognition. Our algorithm exploits the whole active pool to evaluate the uncertainty of the data. Because uncertain data tend to be similar to each other, we make the selected data as diverse as possible by explicitly imposing a diversity constraint on the objective function. Being a multi-class method, our algorithm can exploit uncertainty across multiple classes. An efficient algorithm is used to optimize the objective function. Extensive experiments on action recognition, object classification, scene recognition, and event detection demonstrate its advantages.
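The trade-off between uncertainty and diversity described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the paper evaluates uncertainty over the whole active pool and optimizes an objective with an explicit diversity constraint, whereas the code below assumes class probabilities are already given, measures uncertainty by entropy, and uses a simple greedy heuristic. The names `select_batch` and `trade_off`, the scoring rule, and the toy data are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of one class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def select_batch(features, probs, batch_size, trade_off=1.0):
    """Greedily pick pool indices that are uncertain yet dissimilar to earlier picks.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score(i) = entropy(probs[i]) - trade_off * max_j cosine(features[i], features[j])
    where j ranges over the indices selected so far.
    """
    uncertainty = [entropy(p) for p in probs]
    selected = []
    candidates = set(range(len(features)))
    while candidates and len(selected) < batch_size:
        def score(i):
            # Redundancy is the similarity to the closest already-selected sample.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected), default=0.0
            )
            return uncertainty[i] - trade_off * redundancy

        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy pool: samples 0 and 1 are both very uncertain but nearly identical.
features = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [1.0, 1.0]]
probs = [
    [0.34, 0.33, 0.33],
    [0.36, 0.32, 0.32],
    [0.40, 0.35, 0.25],
    [0.90, 0.05, 0.05],
]
batch = select_batch(features, probs, batch_size=2)
print(batch)  # [0, 2]
```

With `trade_off=0.0` the same call returns the two most uncertain samples, `[0, 1]`, which are near-duplicates. The diversity term replaces the second pick with the less similar sample 2.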




  2. Intel(R) Xeon processor, 24 cores




This paper was partially supported by the US Department of Defense, U.S. Army Research Office (W911NF-13-1-0277), by the ARC DECRA project DE130101311, and by the Tianjin Key Laboratory of Cognitive Computing and Application.

Author information



Corresponding author

Correspondence to Feiping Nie.

Additional information

Communicated by Kristen Grauman.


About this article


Cite this article

Yang, Y., Ma, Z., Nie, F. et al. Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization. Int J Comput Vis 113, 113–127 (2015).



Keywords

  • Active learning
  • Uncertainty sampling
  • Diversity maximization