Annotation cost-sensitive active learning by tree sampling

  • Yu-Lin Tsou
  • Hsuan-Tien Lin
Part of the following topical collections:
  1. Special Issue of the ACML 2018 Journal Track


Active learning is an important machine learning setup for reducing the labelling effort of humans. Although most existing works are based on the simple assumption that each labelling query has the same annotation cost, this assumption may not be realistic: annotation costs may vary between data instances, and they may be unknown before the query is made. Traditional active learning algorithms cannot deal with such a realistic scenario. In this work, we study annotation cost-sensitive active learning algorithms, which need to estimate the utility and cost of each query simultaneously. We propose a novel algorithm, the cost-sensitive tree sampling algorithm, that conducts the two estimation tasks together and solves them with a tree-structured model motivated by hierarchical sampling, a famous algorithm for traditional active learning. Extensive experimental results on datasets with simulated and true annotation costs validate that the proposed method is generally superior to other annotation cost-sensitive algorithms.
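To make the setting concrete, the following is a minimal sketch of a generic cost-aware query rule: given an estimated utility and an estimated annotation cost for each unlabelled instance, it picks the affordable instance with the highest utility per unit cost. It is only an illustration of the problem setup and is not the cost-sensitive tree sampling algorithm; the function name `pick_query`, the ratio criterion, and the example numbers are all assumptions introduced here.

```python
from typing import Optional, Sequence


def pick_query(utilities: Sequence[float],
               costs: Sequence[float],
               budget: float) -> Optional[int]:
    """Return the index of the affordable instance with the best
    estimated utility per unit of estimated cost, or None if no
    instance fits within the remaining budget.

    Hypothetical helper for illustration only.
    """
    best_index = None
    best_ratio = float("-inf")
    for i, (utility, cost) in enumerate(zip(utilities, costs)):
        # Skip instances with invalid cost estimates or that exceed the budget.
        if cost <= 0 or cost > budget:
            continue
        ratio = utility / cost
        if ratio > best_ratio:
            best_index, best_ratio = i, ratio
    return best_index


# Made-up estimates for three unlabelled instances.
est_utility = [0.9, 0.5, 0.4]
est_cost = [10.0, 2.0, 4.0]
choice = pick_query(est_utility, est_cost, budget=5.0)
```

With these made-up numbers, a rule that assumes uniform costs would query the first instance because it has the highest estimated utility, whereas the ratio rule prefers the second instance, which is much cheaper to annotate.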


Annotation cost-sensitive · Active learning · Clustering · Decision tree




Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
