Abstract
Multi-label classification targets the prediction of multiple interdependent and non-exclusive binary target variables. Transformation-based algorithms transform the data set so that regular single-label algorithms can be applied to the problem. A special type of transformation-based classifier is the label compression method, which compresses the labels and then typically uses single-label classifiers to predict the compressed labels. So far, there are no compression-based algorithms that follow a problem transformation approach and address non-linear dependencies in the labels. In this paper, we propose a new algorithm, called Maniac (Multi-lAbel classificatioN usIng AutoenCoders), which extracts the non-linear dependencies by compressing the labels using autoencoders. We adapt the training process of autoencoders to make them more suitable for parameter optimization in the context of this algorithm. The method is evaluated on eight standard multi-label data sets. Experiments show that, despite not producing a good ranking, Maniac generates a particularly good bipartition of the labels into positives and negatives. This is caused by rather confident predictions with probabilities that are either very high or very low. Additionally, the algorithm seems to perform better given more labels and a higher label cardinality in the data set.
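The general label-compression pipeline described in the abstract (compress the label matrix with an autoencoder, learn single-label predictors for the compressed code, then decode and threshold the reconstruction) can be sketched as follows. This is a minimal, hypothetical illustration of the idea only, not the authors' Maniac implementation; the network size, learning rate, base learner, and threshold are assumptions made for the example.

```python
# Illustrative sketch of autoencoder-based label compression for
# multi-label classification. Hyperparameters and the choice of base
# learner are assumptions, not taken from the paper.
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class LabelAutoencoder:
    """One-hidden-layer autoencoder trained to reconstruct the label matrix Y."""

    def __init__(self, n_labels, code_size, lr=0.1, epochs=200, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_labels, code_size))
        self.b1 = np.zeros(code_size)
        self.W2 = rng.normal(scale=0.1, size=(code_size, n_labels))
        self.b2 = np.zeros(n_labels)
        self.lr, self.epochs = lr, epochs

    def fit(self, Y):
        for _ in range(self.epochs):
            H = sigmoid(Y @ self.W1 + self.b1)      # encode the labels
            R = sigmoid(H @ self.W2 + self.b2)      # reconstruct the labels
            dR = (R - Y) * R * (1 - R)              # squared-error gradient
            dH = (dR @ self.W2.T) * H * (1 - H)
            self.W2 -= self.lr * H.T @ dR / len(Y)
            self.b2 -= self.lr * dR.mean(axis=0)
            self.W1 -= self.lr * Y.T @ dH / len(Y)
            self.b1 -= self.lr * dH.mean(axis=0)
        return self

    def encode(self, Y):
        return sigmoid(Y @ self.W1 + self.b1)

    def decode(self, H):
        return sigmoid(H @ self.W2 + self.b2)


def fit_compressed(X, Y, code_size=3):
    """Train the autoencoder on Y and one regressor per compressed dimension."""
    ae = LabelAutoencoder(Y.shape[1], code_size).fit(Y)
    H = ae.encode(Y)
    models = [RandomForestRegressor(n_estimators=50).fit(X, H[:, j])
              for j in range(code_size)]
    return ae, models


def predict(ae, models, X, threshold=0.5):
    """Predict the code from X, decode it, and threshold to a bipartition."""
    H_hat = np.column_stack([m.predict(X) for m in models])
    return (ae.decode(H_hat) >= threshold).astype(int)
```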
Notes
1. Notice that ECCs should still be considered a strong baseline method, as confirmed by a fairly recent extensive experimental comparison [10].
2.
3. The implementation is available at https://github.com/kramerlab/maniac and is also directly integrated into Meka (http://meka.sourceforge.net/).
4. It should be noted that, since we train the autoencoders on the labels, this could be understood as supervised learning. Nevertheless, no additional target variable is used for training the autoencoders, and the labels are not treated as target variables in this step; hence, this is still unsupervised training.
5. Due to space limitations, a more detailed and more technically involved version of this section is available at https://github.com/kramerlab/maniac/blob/master/docs/supplementary.pdf.
6. As the authors of [8] did not share their code for experimental comparison and we were not able to reproduce the published results, we did not compare against this algorithm.
7. The autoencoder implementation is available at https://github.com/kramerlab/autoencoder.
8. Due to space limitations, we give only representative results; the full results are available at https://github.com/kramerlab/maniac/blob/master/docs/supplementary.pdf.
9. We used only a single core of an Intel® Core™ i7-4770K CPU (3.50 GHz) and 4 GB of RAM.
References
Boutell, M.R., Luo, J., Shen, X., Brown, C.M.: Learning multi-label scene classification. Pattern Recogn. 37(9), 1757–1771 (2004)
Clare, A.J., King, R.D.: Knowledge discovery in multi-label phenotype data. In: Siebes, A., De Raedt, L. (eds.) PKDD 2001. LNCS (LNAI), vol. 2168, p. 42. Springer, Heidelberg (2001)
Diplaris, S., Tsoumakas, G., Mitkas, P.A., Vlahavas, I.P.: Protein classification with multiple algorithms. In: Bozanis, P., Houstis, E.N. (eds.) PCI 2005. LNCS, vol. 3746, pp. 448–456. Springer, Heidelberg (2005)
Elisseeff, A., Weston, J.: A kernel method for multi-labelled classification. In: Advances in Neural Information Processing Systems, pp. 681–687 (2001)
Gonçalves, E.C., Plastino, A., Freitas, A.A.: A genetic algorithm for optimizing the label ordering in multi-label classifier chains. In: 2013 IEEE 25th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 469–476. IEEE (2013)
Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
Klimt, B., Yang, Y.: The Enron corpus: a new dataset for email classification research. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) ECML 2004. LNCS (LNAI), vol. 3201, pp. 217–226. Springer, Heidelberg (2004)
Li, X., Guo, Y.: Bi-directional representation learning for multi-label classification. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014, Part II. LNCS, vol. 8725, pp. 209–224. Springer, Heidelberg (2014)
Li, X., Zhao, F., Guo, Y.: Conditional restricted Boltzmann machines for multi-label learning with incomplete labels. In: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, pp. 635–643 (2015)
Madjarov, G., Kocev, D., Gjorgjevikj, D., Džeroski, S.: An extensive experimental comparison of methods for multi-label learning. Pattern Recogn. 45(9), 3084–3104 (2012)
Nadeau, C., Bengio, Y.: Inference for the generalization error. Mach. Learn. 52(3), 239–281 (2003)
Nam, J., Kim, J., Loza Mencía, E., Gurevych, I., Fürnkranz, J.: Large-scale multi-label text classification — revisiting neural networks. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014, Part II. LNCS, vol. 8725, pp. 437–452. Springer, Heidelberg (2014)
Pestian, J.P., Brew, C., Matykiewicz, P., Hovermale, D., Johnson, N., Cohen, K.B., Duch, W.: A shared task involving multi-label classification of clinical free text. In: Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing, pp. 97–104. Association for Computational Linguistics (2007)
Read, J., Hollmén, J.: A deep interpretation of classifier chains. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds.) IDA 2014. LNCS, vol. 8819, pp. 251–262. Springer, Heidelberg (2014)
Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multi-label classification. Mach. Learn. 85(3), 333–359 (2011)
Spyromitros-Xioufis, E., Tsoumakas, G., Groves, W., Vlahavas, I.: Multi-label classification methods for multi-target regression (2012). arXiv preprint arXiv:1211.6581
Tai, F., Lin, H.T.: Multilabel classification with principal label space transformation. Neural Comput. 24(9), 2508–2542 (2012)
Trohidis, K., Tsoumakas, G., Kalliris, G., Vlahavas, I.P.: Multi-label classification of music into emotions. In: Proceedings of the Ninth International Conference on Music Information Retrieval, vol. 8, pp. 325–330 (2008)
Tsoumakas, G., Dimou, A., Spyromitros, E., Mezaris, V., Kompatsiaris, I., Vlahavas, I.: Correlation-based pruning of stacked binary relevance models for multi-label learning. In: Proceedings of the 1st International Workshop on Learning from Multi-Label Data, pp. 101–116 (2009)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Effective and efficient multilabel classification in domains with large number of labels. In: Proceedings of the ECML/PKDD 2008 Workshop on Mining Multidimensional Data (MMD 2008), pp. 30–44 (2008)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Mining multi-label data. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 667–685. Springer, Heidelberg (2010)
Tsoumakas, G., Vlahavas, I.P.: Random k-labelsets: an ensemble method for multilabel classification. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds.) ECML 2007. LNCS (LNAI), vol. 4701, pp. 406–417. Springer, Heidelberg (2007)
Turnbull, D., Barrington, L., Torres, D., Lanckriet, G.: Semantic annotation and retrieval of music and sound effects. IEEE Trans. Audio Speech Lang. Process. 16(2), 467–476 (2008)
Wicker, J., Pfahringer, B., Kramer, S.: Multi-label classification using Boolean matrix decomposition. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, pp. 179–186. ACM (2012)
Zhang, M.L., Zhou, Z.H.: ML-KNN: a lazy learning approach to multi-label learning. Pattern Recogn. 40(7), 2038–2048 (2007)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wicker, J., Tyukin, A., Kramer, S. (2016). A Nonlinear Label Compression and Transformation Method for Multi-label Classification Using Autoencoders. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol. 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_27
DOI: https://doi.org/10.1007/978-3-319-31753-3_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3