Fuzzy rough assisted missing value imputation and feature selection

  • Original Article
  • Neural Computing and Applications

Abstract

Missing values and irrelevant features are commonplace issues in data and need to be handled effectively. Missing value imputation and feature selection are efficient techniques for addressing these problems. Fuzzy rough set-based approaches offer several ways of additionally handling the vagueness and uncertainty present in the data. This paper introduces a method that first imputes missing values and then selects features, both using fuzzy rough set-based approaches. For imputation, the ideas of missing value estimation and instance ignorance are combined, and only correlated features are employed; feature selection is then carried out with a search heuristic. Experimental evaluation on benchmark datasets demonstrates the applicability and robustness of the proposed work: after imputing missing values, it significantly reduces data dimensionality while maintaining high performance. A comparative analysis demonstrates the superiority of the proposed methodology.
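The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose formulas are not given in this preview. It assumes a simplified stand-in: a similarity-weighted estimate computed only over features whose absolute Pearson correlation with the incomplete feature reaches a threshold, followed by greedy forward selection on the standard fuzzy rough dependency degree (Kleene-Dienes implicator, crisp classes). All function names, the threshold of 0.8 and the toy data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation over the positions where both values are observed."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def feature_ranges(data):
    """Observed range of each feature; 1.0 stands in for constant or empty columns."""
    ranges = []
    for j in range(len(data[0])):
        obs = [row[j] for row in data if row[j] is not None]
        ranges.append(max(obs) - min(obs) if obs else 0.0)
    return [r if r > 0 else 1.0 for r in ranges]

def similarity(a, b, feats, ranges):
    """Fuzzy similarity of two rows: min over feats of 1 - |a_k - b_k| / range_k.
    Features missing in either row are skipped; None means nothing was comparable."""
    sims = [1.0 - abs(a[k] - b[k]) / ranges[k]
            for k in feats if a[k] is not None and b[k] is not None]
    return min(sims) if sims else None

def impute(data, threshold=0.8):
    """Fill each missing cell with a similarity-weighted mean of the observed values,
    measuring similarity on correlated features only (assumed rule, not the paper's)."""
    n_feat = len(data[0])
    cols = [[row[j] for row in data] for j in range(n_feat)]
    ranges = feature_ranges(data)
    filled = [list(row) for row in data]
    for i, row in enumerate(data):
        for j in range(n_feat):
            if row[j] is not None:
                continue
            related = [k for k in range(n_feat)
                       if k != j and abs(pearson(cols[j], cols[k])) >= threshold]
            num = den = 0.0
            for other in data:
                if other[j] is None:
                    continue  # rows lacking the target feature cannot act as donors
                s = similarity(row, other, related, ranges)
                if s:  # skip incomparable (None) and fully dissimilar (0.0) donors
                    num += s * other[j]
                    den += s
            if den > 0:
                filled[i][j] = num / den  # otherwise the cell stays None
    return filled

def dependency(data, labels, feats):
    """Fuzzy rough dependency degree on complete data: mean membership of each row
    in the lower approximation of its own class."""
    ranges = feature_ranges(data)
    total = 0.0
    for a, la in zip(data, labels):
        total += min((1.0 - similarity(a, b, feats, ranges)
                      for b, lb in zip(data, labels) if lb != la), default=1.0)
    return total / len(data)

def greedy_select(data, labels):
    """Forward search: add the feature that raises the dependency most; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(data[0])))
    while remaining:
        scores = {f: dependency(data, labels, selected + [f]) for f in remaining}
        f = max(scores, key=scores.get)
        if scores[f] <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = scores[f]
    return selected

# Toy data: feature 1 is twice feature 0, and one of its values is missing.
toy = [[1.0, 2.0, 5.0], [2.0, 4.0, 1.0], [3.0, None, 4.0], [4.0, 8.0, 2.0]]
filled = impute(toy)
selected = greedy_select(filled, [0, 0, 1, 1])
print(filled[2][1], selected)
```

On this toy input only feature 0 passes the correlation threshold for the incomplete feature, so the missing cell is estimated at about 5.2, and the greedy search keeps features 0 and 2. Restricting the similarity to correlated features keeps unrelated columns from diluting the estimate, which is the intuition the abstract points to.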

Data Availability

All data analysed during this study are from the UCI Machine Learning Repository [30], OpenML [31] and OpenMV.net (https://openmv.net/tag/missing-data).

Notes

  1. https://openmv.net/tag/missing-data.

References

  1. Gupta A, Lam MS (1996) Estimating missing values using neural networks. J Oper Res Soc 47(2):229–238

  2. Song S, Sun Y, Zhang A, Chen L, Wang J (2018) Enriching data imputation under similarity rule constraints. IEEE Trans Knowl Data Eng

  3. Honghai F, Guoshun C, Cheng Y, Bingru Y, Yumei C (2005) A SVM regression based approach to filling in missing values. In: International Conference on Knowledge-Based and Intelligent Information and Engineering Systems. Springer, pp 581–587

  4. Liao Z, Lu X, Yang T, Wang H (2009) Missing data imputation: a fuzzy k-means clustering algorithm over sliding window. In: 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 3. IEEE, pp 133–137

  5. de França FO, Coelho GP, Von Zuben FJ (2013) Predicting missing values with biclustering: A coherence-based approach. Pattern Recogn 46(5):1255–1266

  6. Liu Z-G, Pan Q, Dezert J, Martin A (2016) Adaptive imputation of missing values for incomplete pattern classification. Pattern Recogn 52:85–95

  7. Sefidian AM, Daneshpour N (2019) Missing value imputation using a novel grey based fuzzy c-means, mutual information based feature selection, and regression model. Expert Syst Appl 115:68–94

  8. Rastegar S, Araujo R, Mendes J (2017) Online identification of Takagi-Sugeno fuzzy models based on self-adaptive hierarchical particle swarm optimization algorithm. Appl Math Model 45:606–620

  9. Silva-Ramirez E-L, Cabrera-Sánchez J-F (2022) Correction to: Co-active neuro-fuzzy inference system model as single imputation approach for non-monotone pattern of missing data. Neural Comput Appl 34(3):2495–2496

  10. Shu W, Shen H (2014) Incremental feature selection based on rough set in dynamic incomplete data. Pattern Recogn 47(12):3890–3906

  11. Safi M (2021) Data imputation using differential dependency and fuzzy multi-objective linear programming, Ph.D. thesis, University of Windsor (Canada)

  12. Choudhury SJ, Pal NR (2022) Fuzzy clustering of single-view incomplete data using a multi-view framework. IEEE Trans Fuzzy Syst

  13. Dubois D, Prade H (1992) Putting rough sets and fuzzy sets together. In: Intelligent Decision Support. Springer, pp 203–232

  14. Raja P, Sasirekha K, Thangavel K (2019) A novel fuzzy rough clustering parameter-based missing value imputation. Neural Comput Appl 1–18

  15. Jain P, Tiwari AK, Som T (2020) A fitting model based intuitionistic fuzzy rough feature selection. Eng Appl Artif Intell 89:103421

  16. Jain P, Tiwari AK, Som T (2022) An intuitionistic fuzzy bireduct model and its application to cancer treatment. Comput Ind Eng 168:108124

  17. Jain P, Tiwari AK, Som T (2021) Enhanced prediction of anti-tubercular peptides from sequence information using divergence measure-based intuitionistic fuzzy-rough feature selection. Soft Comput 25(4):3065–3086

  18. Huang Z, Li J (2022) Noise-tolerant discrimination indexes for fuzzy γ covering and feature subset selection. IEEE Trans Neural Netw Learn Syst

  19. Zhang X, Mei C, Chen D, Li J (2016) Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy. Pattern Recogn 56:1–15

  20. Li Y, Wu Z-F (2008) Fuzzy feature selection based on min-max learning rule and extension matrix. Pattern Recogn 41(1):217–226

  21. Yuan Z, Chen H, Li T (2022) Exploring interactive attribute reduction via fuzzy complementary entropy for unlabeled mixed data. Pattern Recogn 127:108651

  22. Wan J, Chen H, Li T, Sang B, Yuan Z (2022) Feature grouping and selection with graph theory in robust fuzzy rough approximation space. IEEE Trans Fuzzy Syst

  23. Qiu Z, Zhao H (2022) A fuzzy rough set approach to hierarchical feature selection based on Hausdorff distance. Appl Intell 1–14

  24. Dengfeng L, Chuntian C (2002) New similarity measures of intuitionistic fuzzy sets and application to pattern recognitions. Pattern Recogn Lett 23(1–3):221–225

  25. Radzikowska AM, Kerre EE (2002) A comparative study of fuzzy rough sets. Fuzzy Sets Syst 126(2):137–155

  26. Jensen R, Mac Parthaláin N, Cornelis C (2014) Feature grouping-based fuzzy-rough feature selection. In: 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE, pp 1488–1495

  27. Liu H, Motoda H (2012) Feature selection for knowledge discovery and data mining, vol 454. Springer Science & Business Media, Berlin

  28. Wang G-G, Deb S, Zhao X, Cui Z (2018) A new monarch butterfly optimization with an improved crossover operator. Oper Res 18(3):731–755

  29. Wang G-G, Zhao X, Deb S (2015) A novel monarch butterfly optimization with greedy strategy and self-adaptive. In: 2015 Second International Conference on Soft Computing and Machine Intelligence (ISCMI). IEEE, pp 45–50

  30. Asuncion A, Newman D (2007) UCI machine learning repository

  31. Vanschoren J, Van Rijn JN, Bischl B, Torgo L (2014) OpenML: networked science in machine learning. ACM SIGKDD Explor Newsl 15(2):49–60

  32. Singh S, Haddon J, Markou M (2001) Nearest-neighbour classifiers in natural scene analysis. Pattern Recogn 34(8):1601–1612

  33. Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Comput 13(3):637–649

  34. Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11(1):86–92

  35. Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56(293):52–64

  36. Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult Valued Logic Soft Comput 17

  37. Maini T, Kumar A, Misra RK, Singh D Intelligent fuzzy rough set based feature selection using swarm algorithms with improved initialization. J Intell Fuzzy Syst (Preprint) 1–10

  38. Wang C, Qi Y, Shao M, Hu Q, Chen D, Qian Y, Lin Y (2016) A fitting model for feature selection with fuzzy rough sets. IEEE Trans Fuzzy Syst 25(4):741–753

Acknowledgements

This research work is funded by the UGC Research Fellowship, India (Grant no: 3600/(PWD)(NET-NOV2017)), awarded to the first author.

Funding

This study was funded by UGC Research Fellowship, India.

Author information

Corresponding author

Correspondence to Pankhuri Jain.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jain, P., Tiwari, A. & Som, T. Fuzzy rough assisted missing value imputation and feature selection. Neural Comput & Applic 35, 2773–2793 (2023). https://doi.org/10.1007/s00521-022-07754-9

  • DOI: https://doi.org/10.1007/s00521-022-07754-9
