Abstract
Feature selection (FS) stability is an important topic of recent interest. Finding stable features is important for creating reliable, non-overfitted feature sets, which in turn can be used to generate machine learning models with better accuracy and explanations and are less prone to adversarial attacks. There are currently several definitions of FS stability that are widely used. In this paper, we demonstrate that existing stability metrics fail to quantify certain key elements of many datasets such as resilience to data drift or non-uniformly distributed missing values. To address this shortcoming, we propose a new definition for FS stability inspired by Lyapunov stability in dynamic systems. We show the proposed definition is statistically different from the classical record-stability on (\(n=90\)) datasets. We present the advantages and disadvantages of using Lyapunov and other stability definitions and demonstrate three scenarios in which each one of the three proposed stability metrics is best suited.
Article PDF
Similar content being viewed by others
Data transparency
All the data used in this research is provided as supplementary materials.
References
Ling, C.X., Huang, J., Zhang, H.: AUC: a better measure than accuracy in comparing learning algorithms. Adv. Artif. Intell. (2003)
Huang, J., Ling, C.X.: Using auc and accuracy in evaluating learning algorithms. Adv. Artif. Intell. 17(3), 299–310 (2005)
Al-Jarrah, O.Y., Yoo, P.D., Muhaidat, S., Karagiannidis, G.K., Taha, K.: Efficient machine learning for big data: a review. Big Data Res. 2(3), 87–93 (2015)
Jordan, M.I., Mitchell, T.M.: Machine learning: trends, perspectives, and prospects. Science 349(6245), 255–260 (2015)
Beriman, L.: Heuristics of instability and stabilization in model selection. Ann. Stat. 24, 2350–2383 (1996)
Bousquet, O., Elisseff, A.: Stability and generalization. J. Mach. Learn. Res. 2, 499–526 (2002)
Rosenfeld, A., Richardson, A.: Explainability in human-agent systems. Auton. Agents Multi-Agent Syst. 33(6), 673–705 (2019)
Ben-Hur, A., Elisseeff, I., Guyon, A.: A stability based method for discovering structure in clustered data. Pac. Symp. Biocomput. 1, 6–17 (2002)
Meinshausen, N., Buhlmann, P.: Stability selection. J. R. Stat. Soc. 72, 414–473 (2010)
Wang, J.: Consistent selection of the number of clusters via cross validation. Biometrika 72, 893–904 (2010)
Liu, K., Roeder, K., Wasserman, L.: Stability approach to regularization selection for high-dim graphical models. Adv. Neural Inf. Process. Syst. 23, (2010)
Stodden, V., Leisch, F., Peng, R.: Implementing reproducible research. CRC Press (2014)
Shah, P., Kendall, F., Khozin, S., Goosen, R., Hu, J., Laramie, J., Ringel, M., Schork, N.: Artificial intelligence and machine learning in clinical development: a transnational perspective. Npj Digit. Med. 69, 1–34 (2019)
Boyko, N., Sviridova, T., Shakhovska, N.: Use of machine learning in the forecast of clinical consequences of cancer diseases. 7th Mediterranean Conference on Embedded Computing (MECO), pp. 1–6 (2018)
Yaniv-Rosenfeld, A., Savchenko, E., Rosenfeld, A., Lazebnik, T.: Scheduling bcg and il-2 injections for bladder cancer immunotherapy treatment. Mathematics, 1–6 (2018)
Veturi, Y.A., Woof, W., Lazebnik, T., Moghul, I., Woodward-Court, P., Wagner, S.K., Cabral de Guimaraes, T.A., Daich Varela, M., Liefers, B., Patel, P.J., Beck, S., Webster, A.R., Mahroo, O., Keane, P.A., Michaelides, M., Balaskas, K., Pontikos, N.: Syntheye Investigating the impact of synthetic data on artificial intelligence-assisted gene diagnosis of inherited retinal disease. Ophthalmology Science 3(2), 100258 (2023)
Weng, S.F., Reps, J., Kai, J., Garibaldi, J.M., Qureshi, N.: Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLOS ONE 12, e0174944 (2017)
Bonner, G.: Decision making for health care professionals: use of decision trees within the community mental health setting. J. Adv. Nursing 35, 349–356 (2001)
Flechet, M., Güiza, F., Schetz, M., Wouters, P., Vanhorebeek, I., Derese, I., Gunst, J., Spriet, I., Casaer, M., Van den Berghe, G., Meyfroidt, G.: Akipredictor, an online prognostic calculator for acute kidney injury in adult critically ill patients: development, validation and comparison to serum neutrophil gelatinase-associated lipocalin. J. Adv. Nursing 35, 349–356 (2001)
Shung, D.L., Au, B., Taylor, R.A., Tay, J.K., Laursen, S.B., Stanley, A.J., Dalton, H.R., Ngu, J., Schultz, M., Laine, L.: Validation of a machine learning model that outperforms clinical risk scoring systems for upper gastrointestinal bleeding. Gastroenterology 158, 160–167 (2020)
Shamout, F., Zhu, T., Clifton, D.A.: Machine learning for clinical outcome prediction. IEEE Rev. Biomed. Eng. 14, 116–126 (2020)
Lazebnik, T., Somech, A., Weinberg, A.I.: Substrat: a subset-based optimization strategy for faster automl. Proc. VLDB Endow. 16(4), 772–780 (2022)
Aztiria, A., Farhadi, G., Aghajan, H.: User Behavior Shift Detection in Intelligent Environments. Springer, (2012)
Gama, J., Zliobaite, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR), 46, (2014)
Cavalcante, R.C., Oliveira, A.L.I.: An approach to handle concept drift in financial time series based on extreme learning machines and explicit drift detection. Int. Jt. Conf. Neural Netw. (IJCNN), 1–8 (2015)
Lazebnik, T., Fleischer, T., Yaniv-Rosenfeld, A.: Benchmarking biologically-inspired automatic machine learning for economic tasks. Sustainability 11232(14), (2023)
Shami, L., Lazebnik, T.: Implementing machine learning methods in estimating the size of the non-observed economy. Comput. Econ. (2023)
K. Chaudhuri and S. A. Vinterbo. A stability-based validation procedure for differentially private machine learning. Advances in Neural Information Processing Systems, 2013
Yokoyama, H.: Machine learning system architectural pattern for improving operational stability. IEEE Int. Conf. Softw. Architecture Comp. (2019)
Bolón-Canedo, V., Alonso-Betanzos, A.: Ensembles for feature selection: a review and future trends. Inf. Fusion 52, 1–12 (2019)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3(Mar), 1157–1182 (2003)
Liu, H., Motoda, H., Setiono, R., Zhao, Z.: Feature selection: an ever evolving frontier in data mining. In Feature selection in data mining, p 4–13. PMLR (2010)
Rosenfeld, A.: Better metrics for evaluating explainable artificial intelligence. In: AAMAS ’21: 20th International Conference on Autonomous Agents and Multiagent Systems, pp. 45–50. ACM (2021)
Bhatt, U., Xiang, A., Sharma, S., Weller, A., Taly, A., Jia, Y., Ghosh, J., Puri, R., Moura, J.M.F., Eckersley, P.: Explainable machine learning in deployment. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 648–657 (2020)
Lazebnik, T., Bunimovich-Mendrazitsky, S., Rosenfeld, A.: An algorithm to optimize explainability using feature ensembles. Appl. Intell. (2024)
Sun, W.: Stability of machine learning algorithms. Purdue University, (2015)
Kenneth, O.S.: Learning concept drift with a committee of decision trees. Technical Report AI03-302, (2019)
Jain, A.K., Chandrasekaran, B.: Machine learning based concept drift detection for predictive maintenance. Comput. Ind. Eng. 137, 106031 (2019)
Khaire, U.M., Dhanalakshmi, R.: Stability of feature selection algorithm: a review. J. King Saud Univ. Comput. Inf. (2019)
Shah, R., Samworth, R.: Variable selection with error control: another look at stability selection. J. R. Stat. Soc. 75, 55–80 (2013)
Sun, W., Wang, J., Fang, Y.: Consistent selection of tuning parameters via variable selection stability. J. Mach. Learn. Res. 14, 3419–3440 (2013)
Han, Y.: Stable Feature Selection: Theory and Algorithms. PhD thesis, (2012)
Zhang, X., Fan, M., Wang, D., Zhou, P., Tao, D.: Top-k feature selection framework using robust 0-1 integer programming. IEEE Trans. Neural Netw. Learn. Syst. 32(7), 3005–3019 (2021)
Plackett, R.L.: Karl pearson and the chi-squared test. International Statistical Review/Revue Internationale de Statistique, pp. 59–72 (1983)
Chung, N.C., Miasojedow, B., Startek, M., Gambin, A.: Jaccard/tanimoto similarity test and estimation methods for biological presence-absence data. BMC Bioinform. 20, (2019)
Bajusz, D., Racz, A., Heberger, K.: Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminform. 20(7), (2015)
Bookstein, A., Kulyukin, V.A., Raita, T.: Generalized hamming distance. Inf. Retr. 5, 353–375 (2002)
Liu, Y., Mu, Y., Chen, K., Li, Y., Guo, J.: Daily activity feature selection in smart homes based on pearson correlation coefcient. Neural Process. Letters 51, 1771–1787 (2020)
Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. 40(1), 16–28 (2014)
Plackett, R.L.: Karl pearson and the chi-squared test. International Statistical Review/Revue Internationale de Statistique, 59–72 (1983)
Kanna, S.S., Ramaraj, N.: A novel hybrid feature selection via symmetrical uncertainty ranking based local memetic search algorithm. Knowl. Based Syst. 23(6), 580–585 (2010)
Chengzhang, L., Jiucheng, X.: Feature selection with the fisher score followed by the maximal clique centrality algorithm can accurately identify the hub genes of hepatocellular carcinoma. Sci. Rep. 9, 17283 (2019)
Gu, Q., Li, Z., Han, J.: Generalized fisher score for feature selection. In: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, pp. 266–273. AUAI Press (2011)
Azhagusundari, B., Thanamani, A.S.: Feature selection based on information gain. Int. J. Innov. Res. Sci. Eng. Technol. 2(2), 18–21 (2013)
Bommert, A., Michel, L.: stabm: Stability measures for feature selection. J. Open Source Softw. 1, 1 (2021)
Kalousis, A., Prados, J., Hilario, M.: Evaluating feature-selection stability in next-generation proteomics. Knowl. Inf. Syst. 12(1), 95–116 (2007)
Kuncheva, L.I.: A stability index for feature selec. In: Proceedings of the 25th IASTED International Multi-Conference Artificial Intelligence and Applications (2007)
Dernoncourt, D., Hanczar, B., Zucker, J.-D.: Analysis of feature selection stability on high dimension and small sample data. Comput. Stat. Data Anal. 71, 681–693 (2013)
Saeys, Y., Abeel, T.: and Y, vol. de. Springer, Peer. Robust Feature Selection Using Ensemble Feature Selection Techniques (2008)
Yeom, S., Giacomelli, I., Fredrikson, M., Jha, S.: Privacy risk in machine learning: analyzing the connection to overfitting. In: 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pp. 268–282. IEEE (2018)
Nogueira, S., Sechidis, K., Brown, G.: On the stability of feature selection algorithms. J. Mach. Learn. Res. 18, 1–54 (2018)
Lyapunov, A.M..: The general problem of the stability of motion. University Of Kharkov, (1966)
Shami, L., Lazebnik, T.: Economic aspects of the detection of new strains in a multi-strain epidemiological-mathematical model. Chaos, Solitons & Fractals 165, 112823 (2022)
Mayerhofer, T., Klein, S.J., Peer, A., Perschinka, F., Lehner, G.F., Hasslacher, J., Bellmann, R., Gasteiger, L., Mittermayr, S., Eschertzhuber, M., Mathis, S., Fiala, S., Fries, D., Kalenka, A., Foidl, E., Hasibeder, W., Helbok, R., Kirchmair, L., Stogermüller, C., Krismer, B., Heiner, T., Ladner, E., Thome, C., Preub-Hernandez, C., Mayr, A., Pechlaner, A., Potocnik, M., Reitter, M., Brunner, J., Zagitzer-Hofer, S., Ribitsch, A., Joannidis, M.: Changes in characteristics and outcomes of critically ill covid-19 patients in tyrol (Austria) over 1 year. Wiener klinische Wochenschrift 133, 1237–1247 (2021)
Liu, Y., Mu, Y., Chen, K., Li, Y., Guo, J.: Daily activity feature selection in smart homes based on pearson correlation coefcient. Neural Process. Letters 51, 1771–1787 (2020)
A. Jovie, K. Brkie, and N. Bogunovic. A review of feature selection methods with applications. IEEE, (2015). In: Russian
Liu, R., Liu, E., Yang, J., Li, M., Wang, F.: Optimizing the hyper-parameters for svm by combining evolution strategies with a grid search. Intell. Control Automation 344, (2006)
Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: a metric and a loss for bounding box regression. In CVPR 2019 (2019)
Žliobaite, I., Pechenizkiy, M., Gama, J.: Big Data Analysis: New Algorithms for a New Society, vol. 16. Springer (2016)
Gama, J.M., Zliobaite, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46(4), 1–37 (2014)
Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., Zhang, G.: Learning under concept drift: a review. IEEE Trans. Knowl. Data Eng. 31(12), 2346–2363 (2019)
Marlin, B.M.: Missing data problems in machine learning. pp. 1–6. University of Toronto, (2008)
Jerez, J.M., Molina, I., Garcia-Laencina, P.J., Alba, E., Ribelles, N., Martin, M., Franco, L.: Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artif. Intell. Med. 50(2), 105–115 (2010)
Ramoni, M., Sebastiani, P.: Robust learning with missing data. Mach. Learn. 45, 147–170 (2001)
Thomas, R.M., Bruin, W., Zhutovsky, P., van Wingen, G.: Chapter 14 - dealing with missing data, small sample sizes, and heterogeneity in machine learning studies of brain disorders. In: Andrea Mechelli and Sandra Vieira, editors, Machine Learning, pp. 249–266. Academic Press (2020)
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Authors and Affiliations
Contributions
Conceptualization, methodology, formal analysis and investigation, Writing - original draft preparation, and Writing - review and editing: Teddy Lazebnik; Supervision and Writing - review and editing: Avi Rosenfeld.
Corresponding author
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lazebnik, T., Rosenfeld, A. A new definition for feature selection stability analysis. Ann Math Artif Intell (2024). https://doi.org/10.1007/s10472-024-09936-8
Accepted:
Published:
DOI: https://doi.org/10.1007/s10472-024-09936-8