
Incremental feature selection for large-scale hierarchical classification with the arrival of new samples


Abstract

In the era of big data, the number of class labels is growing rapidly, which poses a great challenge to classification tasks. Hierarchical classification was introduced to address this issue by exploiting the structural information among class labels. In this paper, we propose an incremental feature selection algorithm, based on the theory of fuzzy rough sets, for handling the arrival of new samples. As a preliminary step, we propose a non-incremental hierarchical feature selection algorithm that improves an existing method. Then, using the sibling strategy, we discuss how the dependency degree can be computed incrementally when new samples arrive. Based on an analysis of how the dependency degree changes, we design feature addition and deletion strategies, as well as the incremental feature selection algorithm. In the experiments, two versions of the algorithm are evaluated. The results show that our improvement of the existing method is highly effective and significantly accelerates feature selection. In addition, version 2 of the incremental algorithm is considerably more efficient than both the improved non-incremental algorithm and the existing method on several datasets. Compared with six hierarchical feature selection algorithms, our algorithm achieves better results in classification accuracy and three hierarchical evaluation metrics. The effectiveness and efficiency of version 1 are also verified by comparing it with version 2 and the other results.
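To make the general idea concrete, the following is a minimal, hypothetical sketch, not the authors' exact algorithm (which additionally exploits the label hierarchy and the sibling strategy): it computes a fuzzy-rough dependency degree from a Gaussian fuzzy similarity relation and runs a greedy forward search that adds the feature with the largest dependency gain. The function names and the stopping threshold `eps` are illustrative assumptions.

```python
# Illustrative sketch only: fuzzy-rough dependency degree and greedy forward selection.
import numpy as np

def fuzzy_similarity(X, features):
    """Gaussian-kernel fuzzy similarity between samples, using the given feature subset."""
    sub = X[:, features]
    # Pairwise squared Euclidean distances over the selected features.
    d2 = ((sub[:, None, :] - sub[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2)  # values in (0, 1], 1 on the diagonal

def dependency_degree(X, y, features):
    """gamma(B): mean fuzzy lower-approximation membership of each sample to its own class."""
    if not features:
        return 0.0
    R = fuzzy_similarity(X, features)
    n = len(y)
    lower = np.empty(n)
    for i in range(n):
        other = y != y[i]  # samples from other classes
        # Lower-approximation membership: inf over out-of-class samples of (1 - similarity).
        lower[i] = 1.0 if not other.any() else (1.0 - R[i, other]).min()
    return lower.mean()

def greedy_forward_selection(X, y, eps=1e-4):
    """Repeatedly add the feature with the largest dependency gain until the gain is negligible."""
    selected, remaining = [], list(range(X.shape[1]))
    current = 0.0
    while remaining:
        best_gamma, best_f = max((dependency_degree(X, y, selected + [f]), f) for f in remaining)
        if best_gamma - current <= eps:
            break
        selected.append(best_f)
        remaining.remove(best_f)
        current = best_gamma
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((60, 8))
    y = (X[:, 2] + 0.1 * rng.random(60) > 0.5).astype(int)  # class driven mainly by feature 2
    print("selected features:", greedy_forward_selection(X, y))
```

An incremental variant in the spirit of the paper would, upon the arrival of new samples, update the similarity relation and the lower-approximation memberships only where they are affected, rather than recomputing the dependency degree from scratch.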


Data availability and access

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61976244 and 12001422, the Natural Science Foundation of Shaanxi Province under Grants 2021JQ-580 and 2023-JC-YB-597, and the Postgraduate Innovation and Practice Ability Development Fund of Xi’an Shiyou University under Grant YCS22112071.

Author information


Contributions

Yang Tian: Methodology, Software, Validation, Investigation, Formal analysis, Visualization, Writing-original draft; Yanhong She: Conceptualization, Methodology, Writing-original draft, Supervision.

Corresponding author

Correspondence to Yanhong She.

Ethics declarations

Ethical and informed consent for data used

Not applicable.

Competing Interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Tian, Y., She, Y. Incremental feature selection for large-scale hierarchical classification with the arrival of new samples. Appl Intell 54, 3933–3953 (2024). https://doi.org/10.1007/s10489-024-05352-x

