Research article

BMC Bioinformatics, 13:265


Open Access This content is freely available online to anyone, anywhere at any time.

Supervised segmentation of phenotype descriptions for the human skeletal phenome using hybrid methods

  • Tudor Groza (School of ITEE, The University of Queensland)
  • Jane Hunter (School of ITEE, The University of Queensland)
  • Andreas Zankl (Bone Dysplasia Research Group, UQ Centre for Clinical Research (UQCCR), University of Queensland; Genetic Health Queensland, Royal Brisbane and Women’s Hospital)



In recent years, a significant amount of research has addressed the ontology-based formalization of phenotype descriptions. To fully capture the knowledge expressed within these descriptions, we need to take advantage of their inner structure, which implicitly combines qualities and anatomical entities. The first step in this process is the segmentation of the phenotype descriptions into their atomic elements.


We present a two-phase hybrid segmentation method that combines a series of individual classifiers using different aggregation schemes (set operations and simple majority voting). The approach is tested on a corpus of skeletal phenotype descriptions drawn from the Human Phenotype Ontology. Experimental results show that the best hybrid method achieves an F-Score of 97.05% in the first phase and F-Scores of 97.16% / 94.50% in the second phase.
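To make the aggregation schemes more concrete, the following is a minimal sketch of how per-token outputs from several classifiers could be combined by set operations and by simple majority voting, and then scored with an F-Score. It is not the authors' implementation: the label names (`Q` for quality, `A` for anatomical entity, `O` for other), the example description, the three classifier outputs, the tie-breaking rule, and the helper names `majority_vote`, `spans`, and `f_score` are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label sequences (one per classifier) by simple majority.

    Assumption: ties go to the earliest classifier whose label is among
    those with the highest count.
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        combined.append(next(l for l in labels if counts[l] == top))
    return combined


def spans(labels, target):
    """Return the set of (start, end) token spans carrying `target`.

    Adjacent tokens with the same label are merged into one span.
    """
    found, start = set(), None
    for i, label in enumerate(list(labels) + [None]):
        if label == target and start is None:
            start = i
        elif label != target and start is not None:
            found.add((start, i))
            start = None
    return found


def f_score(predicted, gold):
    """Harmonic mean of precision and recall over exact span matches."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical description and labels, for illustration only.
tokens = ["short", "distal", "phalanx", "of", "finger"]
gold = ["Q", "A", "A", "A", "A"]
clf_a = ["Q", "A", "A", "A", "A"]
clf_b = ["Q", "Q", "A", "A", "A"]
clf_c = ["Q", "A", "A", "O", "A"]

gold_spans = spans(gold, "A")
per_classifier = [spans(c, "A") for c in (clf_a, clf_b, clf_c)]

union = set.union(*per_classifier)                # any classifier proposed it
intersection = set.intersection(*per_classifier)  # all classifiers agree
voted = spans(majority_vote([clf_a, clf_b, clf_c]), "A")

for name, result in [("union", union),
                     ("intersection", intersection),
                     ("majority", voted)]:
    print(name, sorted(result), round(f_score(result, gold_spans), 2))
```

In this toy case the union keeps every proposed span and so loses precision, the intersection is empty because the classifiers never agree on an exact span, and majority voting recovers the gold span. Which scheme wins depends on how the individual classifiers err.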


The performance of the initial segmentation of anatomical entities and qualities (phase I) is not affected by the presence or absence of external resources, such as domain dictionaries. More generally, hybrid methods may not always improve segmentation accuracy, as they depend heavily on the goal and on the characteristics of the data.