Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics
This paper investigates how predictive clustering trees can be used to predict gene function in the genome of the yeast Saccharomyces cerevisiae. We consider the MIPS FunCat classification scheme, in which each gene is annotated with one or more classes selected from a given functional class hierarchy. This setting presents two important challenges to machine learning: (1) each instance is labeled with a set of classes instead of just one class, and (2) the classes are structured in a hierarchy; ideally the learning algorithm should also take this hierarchical information into account. Predictive clustering trees generalize decision trees and can be applied to a wide range of prediction tasks by plugging in a suitable distance metric. We define an appropriate distance metric for hierarchical multi-classification and present experiments evaluating this approach on a number of data sets that are available for yeast.
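The abstract does not spell out the distance metric, so the following is only a minimal sketch of one plausible choice: a weighted Euclidean distance between class sets, where each class implies its ancestors and disagreements deeper in the hierarchy count less. The toy hierarchy, the function names, and the weighting parameter `w0` are assumptions made for illustration and are not taken from the paper.

```python
import math

def ancestors_closure(labels, parent):
    """Return the given classes together with all of their ancestors."""
    closed = set()
    for c in labels:
        # Walk up the hierarchy until the root or an already-visited class.
        while c is not None and c not in closed:
            closed.add(c)
            c = parent.get(c)
    return closed

def depth(c, parent):
    """Depth of class c, with top-level classes at depth 1."""
    d = 1
    while parent.get(c) is not None:
        c = parent[c]
        d += 1
    return d

def hierarchical_distance(labels_a, labels_b, parent, w0=0.75):
    """Weighted Euclidean distance between two class sets.

    Each class on which the two (ancestor-closed) sets disagree
    contributes w0 ** depth, so deeper disagreements weigh less
    when 0 < w0 < 1. This weighting is an illustrative assumption.
    """
    a = ancestors_closure(labels_a, parent)
    b = ancestors_closure(labels_b, parent)
    disagreements = a ^ b  # classes present in exactly one of the sets
    return math.sqrt(sum(w0 ** depth(c, parent) for c in disagreements))

# Hypothetical toy hierarchy (child -> parent), loosely styled after FunCat codes.
parent = {"1": None, "1/1": "1", "1/2": "1", "2": None}

d_same = hierarchical_distance({"1/1"}, {"1/1"}, parent)
d_sibling = hierarchical_distance({"1/1"}, {"1/2"}, parent)
d_far = hierarchical_distance({"1/1"}, {"2"}, parent)
```

Under these assumptions, two genes annotated with sibling classes end up closer together than two genes annotated in different top-level branches, which is the kind of hierarchical information a tree learner could exploit when evaluating splits.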
Keywords: Average Precision; Yeast Gene; Prediction Task; Multitask Learning; Hierarchical Information