Decision trees for hierarchical multi-label classification

Vens, Celine; Struyf, Jan; Schietgat, Leander; Džeroski, Sašo; Blockeel, Hendrik

doi:10.1007/s10994-008-5077-3

Decision trees for hierarchical multi-label classification

Published: 01 August 2008

Volume 73, pages 185–214, (2008)
Cite this article

Download PDF

Machine Learning Aims and scope Submit manuscript

Decision trees for hierarchical multi-label classification

Download PDF

Celine Vens¹,
Jan Struyf¹,
Leander Schietgat¹,
Sašo Džeroski² &
…
Hendrik Blockeel¹

9639 Accesses
443 Citations
9 Altmetric
Explore all metrics

Abstract

Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. This article presents several approaches to the induction of decision trees for HMC, as well as an empirical study of their use in functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to two approaches that learn a set of regular classification trees (one for each class). The first approach defines an independent single-label classification task for each class (SC). Obviously, the hierarchy introduces dependencies between the classes. While they are ignored by the first approach, they are exploited by the second approach, named hierarchical single-label classification (HSC). Depending on the application at hand, the hierarchy of classes can be such that each class has at most one parent (tree structure) or such that classes may have multiple parents (DAG structure). The latter case has not been considered before and we show how the HMC and HSC approaches can be modified to support this setting. We compare the three approaches on 24 yeast data sets using as classification schemes MIPS’s FunCat (tree structure) and the Gene Ontology (DAG structure). We show that HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time. We conclude that HMC trees should definitely be considered in HMC tasks where interpretable models are desired.

References

Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25, 3389–3402.
Article Google Scholar
Ashburner, M. et al. (2000). Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics, 25(1), 25–29.
Article Google Scholar
Barutcuoglu, Z., Schapire, R. E., & Troyanskaya, O. G. (2006). Hierarchical multi-label prediction of gene function. Bioinformatics, 22(7), 830–836.
Article Google Scholar
Blockeel, H., Bruynooghe, M., Džeroski, S., Ramon, J., & Struyf, J. (2002). Hierarchical multi-classification. In Proceedings of the ACM SIGKDD 2002 workshop on multi-relational data mining (MRDM 2002) (pp. 21–35).
Blockeel, H., De Raedt, L., & Ramon, J. (1998). Top-down induction of clustering trees. In Proceedings of the 15th international conference on machine learning (pp. 55–63).
Blockeel, H., Džeroski, S., & Grbović, J. (1999). Simultaneous prediction of multiple chemical parameters of river water quality with Tilde. In Proceedings of the 3rd European conference on principles of data mining and knowledge discovery (pp. 32–40).
Blockeel, H., Schietgat, L., Struyf, J., Džeroski, S., & Clare, A. (2006). Decision trees for hierarchical multilabel classification: a case study in functional genomics. In Proceedings of the 10th European conference on principles and practice of knowledge discovery in databases (pp. 18–29).
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont: Wadsworth.
MATH Google Scholar
Cesa-Bianchi, N., Gentile, C., & Zaniboni, L. (2006). Incremental algorithms for hierarchical classification. Journal of Machine Learning Research, 7, 31–54.
MathSciNet Google Scholar
Chu, S., DeRisi, J., Eisen, M., Mulholland, J., Botstein, D., Brown, P., & Herskowitz, I. (1998). The transcriptional program of sporulation in budding yeast. Science, 282, 699–705.
Article Google Scholar
Clare, A. (2003). Machine learning and data mining for yeast functional genomics. PhD thesis, University of Wales, Aberystwyth.
Clare, A., & King, R. D. (2001). Knowledge discovery in multi-label phenotype data. In 5th European conference on principles of data mining and knowledge discovery (pp. 42–53).
Davis, J., & Goadrich, M. (2006), The relationship between precision-recall and ROC curves. In Proceedings of the 23rd international conference on machine learning (pp. 233–240)
Demšar, D., Džeroski, S., Larsen, T., Struyf, J., Axelsen, J., Bruus Pedersen, M., & Henning Krogh, P. (2006). Using multi-objective classification to model communities of soil microarthropods. Ecological Modelling, 191(1), 131–143.
Article Google Scholar
DeRisi, J., Iyer, V., & Brown, P. (1997). Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 278, 680–686.
Article Google Scholar
Džeroski, S., Slavkov, I., Gjorgjioski, V., & Struyf, J. (2006). Analysis of time series data with predictive clustering trees. In Proceedings of the 5th international workshop on knowledge discovery in inductive databases (pp. 47–58).
Eisen, M., Spellman, P., Brown, P., & Botstein, D. (1998). Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the USA, 95, 14863–14868.
Article Google Scholar
Expasy (2008). ProtParam. http://www.expasy.org/tools/protparam.html.
Gasch, A., Huang, M., Metzner, S., Botstein, D., Elledge, S., & Brown, P. (2001). Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p. Molecular Biology of the Cell, 12(10), 2987–3000.
Google Scholar
Gasch, A., Spellman, P., Kao, C., Carmel-Harel, O., Eisen, M., Storz, G., Botstein, D., & Brown, P. (2000). Genomic expression program in the response of yeast cells to environmental changes. Molecular Biology of the Cell, 11, 4241–4257.
Google Scholar
Geurts, P., Wehenkel, L., & d’Alché-Buc, F. (2006). Kernelizing the output of tree-based methods. In Proceedings of the 23th international conference on machine learning (pp. 345–352)
Koller, D., & Sahami, M. (1997). Hierarchically classifying documents using very few words. In Proceedings of the 14th international conference on machine learning (pp. 170–178).
Kumar, A., Cheung, K. H., Ross-Macdonald, P., Coelho, P. S. R., Miller, P., & Snyder, M. (2000). TRIPLES: a database of gene function in S. cerevisiae. Nucleic Acids Research, 28, 81–84.
Article Google Scholar
Mewes, H. W., Heumann, K., Kaps, A., Mayer, K., Pfeiffer, F., Stocker, S., & Frishman, D. (1999). MIPS: a database for protein sequences and complete genomes. Nucl. Acids Research, 27, 44–48.
Article Google Scholar
Oliver, S. (1996). A network approach to the systematic analysis of yeast gene function. Trends in Genetics, 12(7), 241–242.
Article MathSciNet Google Scholar
Ouali, M., & King, R. D. (2000). Cascaded multiple classifiers for secondary structure prediction. Protein Science, 9(6), 1162–1176.
Article Google Scholar
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo: Morgan Kaufmann.
Google Scholar
Roth, F., Hughes, J., Estep, P., & Church, G. (1998). Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nature Biotechnology, 16, 939–945.
Article Google Scholar
Rousu, J., Saunders, C., Szedmak, S., & Shawe-Taylor, J. (2006). Kernel-based learning of hierarchical multilabel classification models. Journal of Machine Learning Research, 7, 1601–1626.
MathSciNet Google Scholar
Spellman, P., Sherlock, G., Zhang, M., Iyer, V., Anders, K., Eisen, M., Brown, P., Botstein, D., & Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9, 3273–3297.
Google Scholar
Stenger, B., Thayananthan, A., Torr, P., & Cipolla, R. (2007). Estimating 3D hand pose using hierarchical multi-label classification. Image and Vision Computing, 5(12), 1885–1894.
Article Google Scholar
Struyf, J., & Džeroski, S. (2006). Constraint based induction of multi-objective regression trees. In Knowledge discovery in inductive databases, 4th international workshop, KDID’05, revised, selected and invited papers (pp. 222–233).
Struyf, J., & Džeroski, S. (2007). Clustering trees with instance level constraints. In Proceedings of the 18th European conference on machine learning (pp. 359–370)
Taskar, B., Guestrin, C., & Koller, D. (2003). Max-margin Markov networks. In Advances in neural information processing systems 16 16
Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 6, 1453–1484.
MathSciNet Google Scholar
Tsoumakas, G., & Vlahavas, I. (2007). Random k-labelsets: an ensemble method for multilabel classification. In Proceedings of the 18th European conference on machine learning (pp. 406–417).
Weiss, G. M., & Provost, F. J. (2003). Learning when training data are costly: the effect of class distribution on tree induction. The Journal of Artificial Intelligence Research, 19, 315–354.
MATH Google Scholar
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics, 1, 80–83.
Article Google Scholar
Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information Retrieval, 1, 69–90.
Article Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Leuven, Belgium
Celine Vens, Jan Struyf, Leander Schietgat & Hendrik Blockeel
Department of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Sašo Džeroski

Authors

Celine Vens
View author publications
You can also search for this author in PubMed Google Scholar
Jan Struyf
View author publications
You can also search for this author in PubMed Google Scholar
Leander Schietgat
View author publications
You can also search for this author in PubMed Google Scholar
Sašo Džeroski
View author publications
You can also search for this author in PubMed Google Scholar
Hendrik Blockeel
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Celine Vens.

Additional information

Editor: Johannes Fürnkranz.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Vens, C., Struyf, J., Schietgat, L. et al. Decision trees for hierarchical multi-label classification. Mach Learn 73, 185–214 (2008). https://doi.org/10.1007/s10994-008-5077-3

Download citation

Received: 16 October 2007
Revised: 11 June 2008
Accepted: 08 July 2008
Published: 01 August 2008
Issue Date: November 2008
DOI: https://doi.org/10.1007/s10994-008-5077-3

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Decision trees for hierarchical multi-label classification

Abstract

Article PDF

Similar content being viewed by others

A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science

A survey on semi-supervised learning

Learning from positive and unlabeled data: a survey

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Decision trees for hierarchical multi-label classification

Abstract

Article PDF

Similar content being viewed by others

A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science

A survey on semi-supervised learning

Learning from positive and unlabeled data: a survey

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation