Abstract
Multi-label learning is an important machine learning task. In multi-label classification, the label space is larger than in traditional single-label classification, and annotations of multi-label instances are typically more time-consuming or expensive to obtain. It is therefore natural to apply active learning to this problem. In this paper, we present three active learning methods with the conditional Bernoulli mixture (CBM) model for multi-label classification. The first two methods use least confidence and approximated entropy, respectively, as selection criteria to pick the most informative instances. In particular, an efficient approximation based on dynamic programming is developed to compute the approximated entropy. The third method is based on the cluster information from the CBM, which implicitly exploits label correlations. Finally, we demonstrate the effectiveness of the proposed methods through experiments on both synthetic and real-world datasets.
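To make the least-confidence criterion concrete, the following is a minimal, generic sketch of least-confidence query selection. It is not the paper's CBM-specific procedure: it assumes we already have, for each unlabeled instance, the probability the model assigns to its single most likely label set, and the function name `least_confidence_query` is purely illustrative.

```python
import numpy as np

def least_confidence_query(top_probs, batch_size=1):
    """Least-confidence active learning selection.

    top_probs: array of shape (n_unlabeled,), where top_probs[i] is the
    probability the model assigns to its most likely label-set prediction
    for unlabeled instance i.
    Returns the indices of the batch_size least confident instances.
    """
    # A lower probability on the top prediction means the model is less
    # certain, so the instance is considered more informative to label.
    order = np.argsort(top_probs)
    return order[:batch_size]

# Toy example: five unlabeled instances with the model's confidence in
# its top label-set prediction for each.
probs = np.array([0.9, 0.55, 0.72, 0.40, 0.88])
picked = least_confidence_query(probs, batch_size=2)
print(picked)  # -> [3 1]
```

The entropy-based criterion replaces the top-prediction probability with the (approximated) entropy of the full label-set distribution, which the paper computes efficiently via dynamic programming over the mixture components.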
J. Chen and S. Sun contributed equally to this work.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Project 61673179 and Shanghai Sailing Program.
© 2018 Springer Nature Switzerland AG
Chen, J., Sun, S., Zhao, J. (2018). Multi-label Active Learning with Conditional Bernoulli Mixtures. In: Geng, X., Kang, BH. (eds) PRICAI 2018: Trends in Artificial Intelligence. PRICAI 2018. Lecture Notes in Computer Science(), vol 11012. Springer, Cham. https://doi.org/10.1007/978-3-319-97304-3_73
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97303-6
Online ISBN: 978-3-319-97304-3
eBook Packages: Computer Science (R0)