Uncertainty Measure-Based Incremental Feature Selection For Hierarchical Classification

International Journal of Fuzzy Systems

Abstract

In the era of big data, complex structures often exist among class labels. The hierarchical structure is a representative one and is mathematically described as a tree or a directed acyclic graph. Most studies in the literature focus on static feature selection in hierarchical information systems. In this study, to solve the incremental feature selection problem of hierarchical classification in a dynamic environment, we develop two incremental algorithms (IHFSGR-1 and IHFSGR-2 for short). As a preliminary step, we propose a new uncertainty measure that quantifies the amount of information contained in a hierarchical classification system and, based on it, develop a non-incremental hierarchical feature selection algorithm. Next, we investigate how this uncertainty measure is updated when new samples arrive and propose two strategies for adding and deleting features, which lead to the two incremental algorithms. Finally, we conduct comparative experiments against several non-incremental algorithms. The experimental results suggest that, compared with these non-incremental algorithms, our incremental algorithms achieve better performance in terms of classification accuracy and two hierarchical evaluation metrics, and significantly accelerate fuzzy rough set-based hierarchical feature selection.
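To make the notions of a tree-structured label space and a hierarchical evaluation metric concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name its two hierarchical metrics, so the hierarchical F1 score and the tree-path distance used here are assumptions chosen because they are common in the hierarchical classification literature. The toy hierarchy `PARENT` and all function names are hypothetical.

```python
# Hypothetical toy class hierarchy stored as child -> parent (root maps to None).
PARENT = {
    "root": None,
    "animal": "root",
    "vehicle": "root",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}


def ancestors(label, parent=PARENT):
    """Return the label together with all of its ancestors, excluding the root."""
    out = set()
    while label is not None and parent[label] is not None:
        out.add(label)
        label = parent[label]
    return out


def hierarchical_f1(y_true, y_pred, parent=PARENT):
    """Micro-averaged hierarchical F1 computed on ancestor-augmented label sets."""
    overlap = pred_total = true_total = 0
    for t, p in zip(y_true, y_pred):
        a_t, a_p = ancestors(t, parent), ancestors(p, parent)
        overlap += len(a_t & a_p)
        pred_total += len(a_p)
        true_total += len(a_t)
    if overlap == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / true_total
    return 2 * precision * recall / (precision + recall)


def tree_distance(a, b, parent=PARENT):
    """Number of edges on the path between two nodes of the tree."""
    def path_to_root(x):
        path = [x]
        while parent[x] is not None:
            x = parent[x]
            path.append(x)
        return path

    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    # The first shared node on each path is the lowest common ancestor.
    da = next(i for i, n in enumerate(pa) if n in common)
    db = next(i for i, n in enumerate(pb) if n in common)
    return da + db
```

In this sketch, predicting `dog` for a true `cat` is penalized less than predicting `car`, because `cat` and `dog` share the ancestor `animal`. A flat accuracy score would treat both mistakes identically, which is why hierarchy-aware metrics are typically reported alongside it.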

Data Availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61976244 and 12001422, the Natural Science Foundation of Shaanxi Province under Grants 2021JQ-580 and 2023-JC-YB-597, and the Postgraduate Innovation and Practice Ability Development Fund of Xi’an Shiyou University under Grant YCS22112071.

Author information

Contributions

Yang Tian: Conceptualization, Methodology, Software, Validation, Investigation, Formal analysis, Visualization, Writing-original draft; Yanhong She: Conceptualization, Methodology, Writing-original draft, Supervision.

Corresponding author

Correspondence to Yanhong She.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethical Approval

Not applicable.

Informed Consent

Not applicable.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tian, Y., She, Y. Uncertainty Measure-Based Incremental Feature Selection For Hierarchical Classification. Int. J. Fuzzy Syst. (2024). https://doi.org/10.1007/s40815-024-01708-0

  • DOI: https://doi.org/10.1007/s40815-024-01708-0