A survey on instance selection for active learning

  • Regular Paper
Knowledge and Information Systems

Abstract

Active learning aims to train an accurate prediction model at minimum cost by labeling only the most informative instances. In this paper, we survey existing work on active learning from an instance-selection perspective and classify it into two categories with a progressive relationship: (1) active learning based merely on the uncertainty of independent and identically distributed (IID) instances, and (2) active learning that further takes instance correlations into account. Using this categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, followed by a simple runtime performance comparison and a discussion of emerging active learning applications and the instance-selection challenges they raise. This survey aims to provide a high-level summary of active learning and to motivate interested readers to consider instance-selection approaches when designing effective active learning solutions.
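The two categories named in the abstract can be illustrated with a minimal sketch. This is not the survey's own algorithm: the function names, the toy pool, the use of predictive entropy as the uncertainty measure, and the use of average cosine similarity as a stand-in for "instance correlation" are all assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0

def select_by_uncertainty(pool_probs):
    """Category (1): treat instances as IID and pick the most uncertain one."""
    return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))

def select_by_uncertainty_and_density(pool_probs, pool_features):
    """Category (2), one hypothetical variant: weight each instance's
    uncertainty by its average similarity to the rest of the pool, so
    that uncertain but isolated instances are down-weighted."""
    n = len(pool_probs)

    def score(i):
        sims = [cosine(pool_features[i], pool_features[j])
                for j in range(n) if j != i]
        density = sum(sims) / len(sims) if sims else 1.0
        return entropy(pool_probs[i]) * density

    return max(range(n), key=score)

# Toy unlabeled pool (assumed values): predicted class probabilities
# from some current model, and the matching feature vectors.
pool_probs = [[0.9, 0.1], [0.5, 0.5], [0.55, 0.45]]
pool_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]

iid_choice = select_by_uncertainty(pool_probs)
correlated_choice = select_by_uncertainty_and_density(pool_probs, pool_features)
```

On this toy pool the two strategies disagree: the purely uncertainty-based rule picks the instance with the flattest prediction, while the density-weighted rule prefers a slightly less uncertain instance that lies close to another pool member.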

Author information

Corresponding author

Correspondence to Yifan Fu.

Additional information

In this paper, model and classifier are interchangeable terms.

About this article

Cite this article

Fu, Y., Zhu, X. & Li, B. A survey on instance selection for active learning. Knowl Inf Syst 35, 249–283 (2013). https://doi.org/10.1007/s10115-012-0507-8

  • DOI: https://doi.org/10.1007/s10115-012-0507-8