Combining instance-based learning and logistic regression for multilabel classification

Cheng, Weiwei; Hüllermeier, Eyke

doi:10.1007/s10994-009-5127-5

Published: 23 July 2009

Machine Learning, Volume 76, pages 211–225 (2009)

Abstract

Multilabel classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Recent research has shown that, just like for conventional classification, instance-based learning algorithms relying on the nearest neighbor estimation principle can be used quite successfully in this context. However, since hitherto existing algorithms do not take correlations and interdependencies between labels into account, their potential has not yet been fully exploited. In this paper, we propose a new approach to multilabel classification, which is based on a framework that unifies instance-based learning and logistic regression, comprising both methods as special cases. This approach allows one to capture interdependencies between labels and, moreover, to combine model-based and similarity-based inference for multilabel classification. As will be shown by experimental studies, our approach is able to improve predictive accuracy in terms of several evaluation criteria for multilabel prediction.
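
The abstract does not spell out the construction, so the following is only a minimal sketch of one way instance-based learning and logistic regression can be combined. The labels of a query's k nearest neighbors are turned into features (one per label), and a separate logistic regression is fit per label on those features. Because each label's model sees neighbor evidence for all labels, it can in principle pick up interdependencies between labels. The function names, the Euclidean distance, the unweighted neighbor fractions, the value of k, the gradient-descent settings, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import math

def neighbor_features(x, X, Y, k, skip=None):
    """Fraction of the k nearest training points that carry each label.

    `skip` excludes one training index (used for leave-one-out features).
    """
    nearest = sorted(
        (i for i in range(len(X)) if i != skip),
        key=lambda i: math.dist(x, X[i]),
    )[:k]
    return [sum(Y[i][j] for i in nearest) / k for j in range(len(Y[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, f):
    """Linear score: bias w[0] plus weighted neighbor features."""
    return w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))

def fit_logistic(F, targets, epochs=500, lr=0.5):
    """Batch gradient descent on the logistic loss; returns [bias, w_1, ...]."""
    w = [0.0] * (len(F[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for f, t in zip(F, targets):
            err = sigmoid(score(w, f)) - t
            grad[0] += err
            for j, fj in enumerate(f):
                grad[j + 1] += err * fj
        w = [wi - lr * g / len(F) for wi, g in zip(w, grad)]
    return w

def fit(X, Y, k=3):
    """One logistic model per label, trained on leave-one-out neighbor features."""
    F = [neighbor_features(x, X, Y, k, skip=i) for i, x in enumerate(X)]
    return [fit_logistic(F, [y[j] for y in Y]) for j in range(len(Y[0]))]

def predict(x, X, Y, models, k=3):
    """Predict a 0/1 vector over all labels for the query x."""
    f = neighbor_features(x, X, Y, k)
    return [int(sigmoid(score(w, f)) > 0.5) for w in models]

# Toy data (hypothetical): labels 0 and 1 always co-occur, label 2 excludes them.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
Y = [[1, 1, 0]] * 4 + [[0, 0, 1]] * 4

models = fit(X, Y)
near_origin = predict((0.2, 0.1), X, Y, models)   # -> [1, 1, 0]
near_far = predict((5.5, 5.5), X, Y, models)      # -> [0, 0, 1]
```

With all cross-label weights fixed at zero, each per-label model would depend only on the neighbor evidence for its own label, which is closer to a plain nearest-neighbor vote; letting every label's model use every neighbor feature is what allows label correlations to enter in this sketch.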

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Marburg, Marburg, Germany
Weiwei Cheng & Eyke Hüllermeier

Corresponding author

Correspondence to Eyke Hüllermeier.

Additional information

Editors: Aleksander Kołcz, Dunja Mladenić, Wray Buntine, Marko Grobelnik, and John Shawe-Taylor.

About this article

Cite this article

Cheng, W., Hüllermeier, E. Combining instance-based learning and logistic regression for multilabel classification. Mach Learn 76, 211–225 (2009). https://doi.org/10.1007/s10994-009-5127-5

Received: 12 June 2009
Revised: 12 June 2009
Accepted: 16 June 2009
Published: 23 July 2009
Issue Date: September 2009
DOI: https://doi.org/10.1007/s10994-009-5127-5
