Conclusions and Research Directions

Part of the Advanced Information and Knowledge Processing book series (AI&KP)


Overall, the hierarchical feature selection methods (especially the lazy learning-based ones) demonstrate the capacity to improve the predictive performance of different classifiers. Their better performance also shows that exploiting hierarchical dependency information as a type of search constraint usually leads to a feature subset with higher predictive power. Note, however, that these hierarchical feature selection methods still have some drawbacks. For example, HIP, one of the top-performing methods, eliminates hierarchical redundancy and selects a feature subset that retains all hierarchical information, but it ignores the relevance of individual features, since it does not consider any measure of association between a feature and the class attribute. Conversely, the MR method eliminates hierarchical redundancy and selects features by considering both the hierarchical information and the features' relevance, but the selected features might not retain the complete hierarchical information.
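The hierarchical-redundancy elimination idea behind HIP can be illustrated with a minimal sketch. This is an illustrative reconstruction, not the authors' implementation: it assumes binary features arranged in a DAG (as with Gene Ontology terms), where a value of 1 for a feature implies 1 for all of its ancestors, so that per instance only the most informative feature values need to be kept.

```python
def hip_select(parents, instance):
    """Lazy, per-instance hierarchical-redundancy elimination
    (a sketch of the HIP idea, not the original code).

    parents:  dict mapping each feature to the set of its parent
              features in the DAG.
    instance: dict mapping each feature to 0 or 1, where value 1
              implies value 1 for every ancestor (as with Gene
              Ontology term annotations).
    Returns the subset of features whose values are not implied by
    another feature's value in this instance.
    """
    def ancestors(f, seen=None):
        # Collect the transitive ancestors of feature f.
        seen = set() if seen is None else seen
        for p in parents.get(f, ()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    redundant = set()
    for f, value in instance.items():
        anc = ancestors(f)
        if value == 1:
            # f = 1 implies 1 for every ancestor, so all ancestors
            # of a positive feature are redundant.
            redundant |= anc
        elif any(instance.get(a, 0) == 0 for a in anc):
            # Some ancestor is 0, which already implies f = 0,
            # so f itself is redundant.
            redundant.add(f)
    return {f for f in instance if f not in redundant}
```

For a chain A → B → C with instance values A = 1, B = 1, C = 0, the sketch keeps {B, C}: B = 1 already implies A = 1, while C = 0 is not implied by any ancestor. Note that this selection considers only the hierarchy, never the association between a feature and the class attribute, which is exactly the limitation of HIP discussed above.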



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

Department of Computer Science, University College London, London, UK
