DS 2017: Discovery Science, pp. 116–123
Option Predictive Clustering Trees for Hierarchical Multi-label Classification
Abstract
In this work, we address the task of hierarchical multi-label classification (HMLC). HMLC is a variant of classification where a single example may belong to multiple classes at the same time and the classes are organized in the form of a hierarchy. Many practically relevant problems can be presented as HMLC tasks, such as predicting gene function, habitat modelling, and annotation of images and videos. We propose to extend predictive clustering trees (PCTs) – a generalization of decision trees – toward learning option predictive clustering trees (OPCTs) for HMLC. OPCTs address the myopia of standard greedy tree induction by considering several alternative splits in the internal nodes of the tree. An option tree can also be regarded as a condensed representation of an ensemble. We evaluate OPCTs on 12 benchmark HMLC datasets from various domains. With the least restrictive parameter values, OPCTs are comparable to the state-of-the-art ensemble methods of bagging and random forests of PCTs. Moreover, OPCTs statistically significantly outperform PCTs.
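The option-node idea described above can be sketched in a few lines: an option node stores several alternative subtrees rooted at the same point, and a prediction for an example aggregates (here, averages) the per-class prediction vectors of all alternatives, which is why an option tree behaves like a condensed ensemble. This is an illustrative sketch under assumed names (`Leaf`, `SplitNode`, `OptionNode`) and an assumed averaging rule, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Leaf:
    # Per-class prediction vector (e.g., label probabilities in HMLC).
    prediction: List[float]

    def predict(self, x: List[float]) -> List[float]:
        return self.prediction


@dataclass
class SplitNode:
    # Ordinary internal node: one test routes the example left or right.
    test: Callable[[List[float]], bool]
    left: object
    right: object

    def predict(self, x: List[float]) -> List[float]:
        return (self.left if self.test(x) else self.right).predict(x)


@dataclass
class OptionNode:
    # Several alternative subtrees at the same point in the tree; their
    # predictions are averaged class-by-class, so the node acts as a
    # compact ensemble of the alternatives.
    options: List[object]

    def predict(self, x: List[float]) -> List[float]:
        preds = [opt.predict(x) for opt in self.options]
        n = len(preds)
        return [sum(vals) / n for vals in zip(*preds)]


# Toy tree with two alternative splits kept at the root.
tree = OptionNode([
    SplitNode(lambda x: x[0] > 0.5, Leaf([1.0, 0.0]), Leaf([0.0, 1.0])),
    SplitNode(lambda x: x[1] > 0.5, Leaf([1.0, 1.0]), Leaf([0.0, 0.0])),
])
# tree.predict([0.7, 0.2]) averages the two subtrees' label vectors.
```

In a real HMLC setting the prediction vectors would cover the whole class hierarchy and aggregation could be weighted, but the averaging step above is the essential difference from a standard decision tree.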
Acknowledgments
We acknowledge the financial support of the European Commission through the grants ICT-2013-612944 MAESTRA and ICT-2013-604102 HBP, as well as the support of the Slovenian Research Agency through young researcher grants and the program Knowledge Technologies (P2-0103).