Abstract
In the era of big data, the number of class labels is growing rapidly, which poses a great challenge to classification tasks. Hierarchical classification was introduced to address this issue by exploiting the structural information among class labels. In this paper, we propose an incremental feature selection algorithm, based on the theory of fuzzy rough sets, for handling the arrival of new samples. As a preliminary step, we propose a non-incremental hierarchical feature selection algorithm that improves on an existing method. Then, using the sibling strategy, we discuss the incremental calculation of the dependency degree when new samples arrive. Based on an analysis of how the dependency degree changes, we design feature addition and deletion strategies, as well as the incremental feature selection algorithm itself. In the experimental section, two versions of the incremental algorithm are designed. The experimental results show that our improvement of the existing method is highly effective and significantly accelerates feature selection. In addition, version 2 of the incremental algorithm is much more efficient than both the improved non-incremental algorithm and the existing method on several datasets. Compared with six hierarchical feature selection algorithms, our algorithm achieves better classification accuracy and better results on three hierarchical evaluation metrics. The effectiveness and efficiency of version 1 are also verified by comparing it with version 2 and the other results.
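To make the idea of incrementally updating the dependency degree concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a common fuzzy rough formulation in which the lower approximation of a sample is the smallest fuzzy distance to any sample whose class is a sibling of its own class, and the dependency degree is the mean of these values. The similarity relation, the function names, and the `siblings` mapping are all hypothetical choices made for illustration. The sibling relation is assumed to be symmetric.

```python
from typing import Dict, List, Set


def similarity(a: List[float], b: List[float]) -> float:
    """Assumed fuzzy similarity for features scaled to [0, 1]."""
    return 1.0 - sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def lower_approximations(X: List[List[float]], y: List[str],
                         siblings: Dict[str, Set[str]]) -> List[float]:
    """Batch computation: for each sample, the minimum of 1 - similarity
    over samples whose class is a sibling of the sample's own class."""
    low = []
    for xi, yi in zip(X, y):
        dists = [1.0 - similarity(xi, xj)
                 for xj, yj in zip(X, y) if yj in siblings[yi]]
        low.append(min(dists, default=1.0))  # no sibling samples -> 1.0
    return low


def add_sample(X: List[List[float]], y: List[str], low: List[float],
               x_new: List[float], y_new: str,
               siblings: Dict[str, Set[str]]) -> List[float]:
    """Incremental update: only samples in sibling classes of y_new can
    change, and each changes by a single min with the new distance."""
    new_low = 1.0
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi in siblings[y_new]:  # relies on a symmetric sibling relation
            d = 1.0 - similarity(xi, x_new)
            low[i] = min(low[i], d)
            new_low = min(new_low, d)
    X.append(x_new)
    y.append(y_new)
    low.append(new_low)
    return low


def dependency(low: List[float]) -> float:
    """Dependency degree as the mean lower approximation."""
    return sum(low) / len(low)


if __name__ == "__main__":
    # Hypothetical two-branch hierarchy: {cat, dog} and {car, bus}.
    sib = {"cat": {"dog"}, "dog": {"cat"}, "car": {"bus"}, "bus": {"car"}}
    data = [[0.1, 0.2], [0.8, 0.9], [0.5, 0.5]]
    labels = ["cat", "dog", "car"]
    low_vals = lower_approximations(data, labels, sib)
    add_sample(data, labels, low_vals, [0.2, 0.2], "dog", sib)
    print(dependency(low_vals))
```

Under these assumptions, the arrival of one sample costs a single pass over the existing samples instead of a full pairwise recomputation, which is the kind of saving an incremental scheme aims for.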
Data availability and access
The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grants 61976244 and 12001422, the Natural Science Foundation of Shaanxi Province under Grants 2021JQ-580 and 2023-JC-YB-597, and the Postgraduate Innovation and Practice Ability Development Fund of Xi’an Shiyou University under Grant YCS22112071.
Author information
Contributions
Yang Tian: Methodology, Software, Validation, Investigation, Formal analysis, Visualization, Writing-original draft; Yanhong She: Conceptualization, Methodology, Writing-original draft, Supervision.
Ethics declarations
Ethical and informed consent for data used
Not applicable.
Competing Interests
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Tian, Y., She, Y. Incremental feature selection for large-scale hierarchical classification with the arrival of new samples. Appl Intell 54, 3933–3953 (2024). https://doi.org/10.1007/s10489-024-05352-x
DOI: https://doi.org/10.1007/s10489-024-05352-x