Abstract
The growing use of technology makes the development of secure applications essential. However, the secure software development cycle is costly, given the human effort required to review application code for vulnerabilities. To reduce this cost, software development teams can use Vulnerability Prediction Models (VPMs) during inspection tasks. The low precision of VPMs, however, makes their application infeasible, because it translates into wasted human effort during inspection. One obstacle to building efficient (i.e., high-precision) VPMs is modeling meaningful features related to vulnerabilities, especially in the initial training stages. In this work we explore a promising feature extracted with techniques from another domain (i.e., defect prediction). We evaluated the feature within an active learning-based VPM through a simulation on real open source projects. Our results indicate that the feature is promising for saving cost when applied to vulnerability inspection tasks.
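To make the link between low precision and wasted inspection effort concrete, the following minimal sketch computes the precision of a set of flagged files and the number of inspections spent on files that turn out not to be vulnerable. The function name and the file names are hypothetical and chosen only for illustration; this is not the model or the data evaluated in the chapter.

```python
def inspection_precision(flagged, vulnerable):
    """Return (precision, wasted_inspections) for a set of flagged files.

    precision = true positives / all flagged files.
    wasted_inspections = flagged files that are not actually vulnerable.
    """
    flagged = set(flagged)
    true_positives = len(flagged & set(vulnerable))
    wasted = len(flagged) - true_positives
    # With nothing flagged, report 0.0 rather than dividing by zero.
    precision = true_positives / len(flagged) if flagged else 0.0
    return precision, wasted


# Hypothetical example: four files flagged by a model, two truly vulnerable.
flagged_files = ["auth.c", "parser.c", "net.c", "ui.c"]
vulnerable_files = ["auth.c", "crypto.c"]

precision, wasted = inspection_precision(flagged_files, vulnerable_files)
print(f"precision={precision:.2f}, wasted inspections={wasted}")
```

In this toy case only one of the four flagged files is vulnerable, so three of the four reviews are spent on files with no vulnerability.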
Notes
- 1. Vulnerabilities are a subset of defects.
- 2.
Acknowledgements
This work was supported by Fundação para a Ciência e a Tecnologia, I.P. (Portuguese Foundation for Science and Technology) through the project UIDB/05064/2020 (VALORIZA-Research Centre for Endogenous Resource Valorization). It was also supported by PES-2021-0140 UFFS.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Apolinário, V.A., Bianco, G.D., Duarte, D., Leithardt, V.R.Q. (2023). Exploring Feature Extraction to Vulnerability Prediction Problem. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_7
DOI: https://doi.org/10.1007/978-3-031-14859-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14858-3
Online ISBN: 978-3-031-14859-0
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)