Conclusions and Research Directions
Overall, the hierarchical feature selection methods (especially the lazy learning-based ones) demonstrate the capacity to improve the predictive performance of different classifiers. Their better performance also shows that exploiting hierarchical dependency information as a search constraint usually leads to a feature subset with higher predictive power. Note, however, that these hierarchical feature selection methods still have some drawbacks. For example, HIP, one of the top-performing methods, eliminates hierarchical redundancy and selects a feature subset that retains all hierarchical information, but it ignores the relevance of individual features, since it does not consider any measure of association between a feature and the class attribute. Analogously, the MR method eliminates hierarchical redundancy and selects features by considering both the hierarchical information and the features' relevance, but the selected features might not retain the complete hierarchical information.
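The hierarchical redundancy elimination idea behind HIP can be illustrated with a minimal sketch. Assuming binary features organized in a DAG of generalization relations (as with Gene Ontology terms), a feature valued 1 implies all its ancestors are also 1, so those ancestors are redundant; a feature valued 0 implies all its descendants are also 0, so those descendants are redundant. The function and data structures below are illustrative, not the authors' implementation:

```python
def hip_select(instance, ancestors, descendants):
    """Lazily select features for one test instance, dropping
    hierarchically redundant ones (sketch of the HIP principle).

    instance:    dict mapping feature name -> 0/1 value
    ancestors:   dict mapping feature -> set of its ancestor features
    descendants: dict mapping feature -> set of its descendant features
    """
    selected = set(instance)
    for feat, value in instance.items():
        if value == 1:
            # ancestors are implied to be 1, so they add no information
            selected -= ancestors.get(feat, set())
        else:
            # descendants are implied to be 0, so they add no information
            selected -= descendants.get(feat, set())
    return {f: instance[f] for f in selected}

# Tiny example hierarchy: A is an ancestor of B, and B of C.
ancestors = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
descendants = {"A": {"B", "C"}, "B": {"C"}, "C": set()}

# C = 1 implies A = B = 1, so only C survives the selection.
print(hip_select({"A": 1, "B": 1, "C": 1}, ancestors, descendants))
```

Because the selection depends on the feature values of the specific test instance, a different instance can yield a different subset, which is what makes the method lazy. Extending such a sketch to also score each feature's association with the class attribute points toward the kind of relevance-aware selection that MR performs.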