Bioinformatics and Biomedical Engineering

Volume 9043 of the series Lecture Notes in Computer Science, pp. 66-77

Prediction of Human Gene - Phenotype Associations by Exploiting the Hierarchical Structure of the Human Phenotype Ontology

  • Giorgio Valentini, AnacletoLab - DI, Dipartimento di Informatica, Università degli Studi di Milano
  • Sebastian Köhler, Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin
  • Matteo Re, AnacletoLab - DI, Dipartimento di Informatica, Università degli Studi di Milano
  • Marco Notaro, Dipartimento di Bioscienze, Università degli Studi di Milano
  • Peter N. Robinson, Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin; Institute of Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin

Abstract

The Human Phenotype Ontology (HPO) provides a conceptualization of phenotype information and a tool for the computational analysis of human diseases. It covers a wide range of phenotypic abnormalities encountered in human diseases, and its terms (classes) are structured as a directed acyclic graph (DAG). In this context, predicting the phenotypic abnormalities associated with human genes is a key tool for stratifying patients into disease subclasses that share a common biological or pathophysiological basis. Methods are being developed to predict the HPO terms associated with a given disease or disease gene, but most adopt a simple "flat" approach: they do not take into account the hierarchical relationships of the HPO and thus lose important a priori information about HPO terms. In this contribution we propose a novel Hierarchical Top-Down (HTD) algorithm that associates a specific learner with each HPO term and then corrects the predictions according to the hierarchical structure of the underlying DAG. Genome-wide experimental results on a complex HPO DAG of more than 4000 HPO terms show that the proposed hierarchy-aware approach significantly improves on the predictions obtained with flat methods, especially in terms of precision/recall results.
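
The abstract does not spell out the correction rule, so the following is only a minimal sketch of one common top-down scheme, not necessarily the authors' HTD algorithm. It assumes that each term already has a "flat" score from its own learner and that a term's corrected score is capped by the corrected scores of all its parents, so that a gene is never scored higher for a specific term than for a more general ancestor. The function name `htd_correct`, the term labels, and the scores are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def htd_correct(flat_scores, parents):
    """Cap each term's flat score by its parents' corrected scores.

    flat_scores: dict term -> score in [0, 1] from a per-term learner.
    parents:     dict term -> list of parent terms in the DAG
                 (root terms may be absent or map to an empty list).
    Assumption: every term named in `parents` also has a flat score.
    """
    # Map each term to its predecessors (its parents) so that the
    # topological order visits parents before their children.
    graph = {term: parents.get(term, ()) for term in flat_scores}
    corrected = {}
    for term in TopologicalSorter(graph).static_order():
        score = flat_scores[term]
        for parent in parents.get(term, ()):
            # A child may not score higher than any of its parents.
            score = min(score, corrected[parent])
        corrected[term] = score
    return corrected


# Hypothetical toy DAG: term C has two parents, A and B.
flat = {"root": 0.9, "A": 0.4, "B": 0.7, "C": 0.8}
dag_parents = {"A": ["root"], "B": ["root"], "C": ["A", "B"]}
corrected = htd_correct(flat, dag_parents)
print(corrected)
```

In this toy case the flat score of `C` (0.8) exceeds that of its parent `A` (0.4), so the correction lowers it to 0.4, while `root`, `A`, and `B` keep their flat scores. A flat method would report the inconsistent value unchanged.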

Keywords

Human Phenotype Ontology term prediction; Ensemble methods; Hierarchical classification methods; Disease gene prioritization