Skip to main content

Exploring Feature Extraction to Vulnerability Prediction Problem

  • Conference paper
  • First Online:
New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence (DiTTEt 2022)

Abstract

The growing use of technology makes the development of secure applications essential. In contrast, the secure software development cycle is a costly task, considering the human effort required to review application code for finding vulnerabilities. In order to minimize this cost (human effort), Vulnerability Prediction Models (VPMs) can be used by software development teams during inspection tasks. The VPM low precision makes its application unfeasible, because it indicates the waste of human effort during the inspection. One of the obstacles in the construction of efficient VPMs (i.e., high precision) is modeling meaningful features related to the vulnerabilities, specially in the initial training stages. In this work we compare a promising feature, extracted through another domain (i.e., defect prediction) techniques. We evaluated the feature within an active learning-based VPM through a simulation on real open source projects. Our results indicates that the feature looks promising in cost saving when applied to vulnerability inspection tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    The vulnerabilities are a subset of defects.

  2. 2.

    https://github.com/vulnerability-research.

References

  1. Bilgin, Z., Ersoy, M.A., Soykan, E.U., Tomur, E., Çomak, P., Karaçay, L.: Vulnerability prediction from source code using machine learning. IEEE Access 8, 150672–150684 (2020)

    Article  Google Scholar 

  2. Duarte, D., Ståhl, N.: Machine learning: a concise overview. In: Said, A., Torra, V. (eds.) Data Science in Practice. SBD, vol. 46, pp. 27–58. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-97556-6_3

    Chapter  Google Scholar 

  3. Jabeen, G., et al.: Machine learning techniques for software vulnerability prediction: a comparative study. Appl. Intell. 1–22 (2022). https://doi.org/10.1007/s10489-022-03350-5

  4. Kudjo, P.K., Chen, J.: A cost-effective strategy for software vulnerability prediction based on bellwether analysis. In: Proceedings of the 28th ACM SIGSOFT, pp. 424–427 (2019)

    Google Scholar 

  5. Li, Z., Shao, Y.: A survey of feature selection for vulnerability prediction using feature-based machine learning. In: Proceedings of the 2019 ICML, pp. 36–42 (2019)

    Google Scholar 

  6. Lika, B., Kolomvatsos, K., Hadjiefthymiades, S.: Facing the cold start problem in recommender systems. Expert Syst. Appl. 41(4), 2065–2073 (2014)

    Article  Google Scholar 

  7. Manning, C., Raghavan, P., Schütze, H.: Introduction to information retrieval. Nat. Lang. Eng. 16(1), 100–103 (2010)

    Article  Google Scholar 

  8. Morrison, P., Herzig, K., Murphy, B., Williams, L.: Challenges with applying vulnerability prediction models. In: Proceedings of the 2015 Symposium and Bootcamp on the Science of Security, pp. 1–9 (2015)

    Google Scholar 

  9. Nam, J., Kim, S.: Clami: Defect prediction on unlabeled datasets (t). In: 2015 30th IEEE/ACM ASE, pp. 452–463. IEEE (2015)

    Google Scholar 

  10. Pereira, F., Crocker, P., Leithardt, V.R.: Padres: tool for privacy, data regulation and security. SoftwareX 17, 100895 (2022)

    Google Scholar 

  11. Settles, B.: Active learning literature survey. Computer Sciences Technical report 1648, University of Wisconsin-Madison (2009)

    Google Scholar 

  12. Shamal, P., Rahamathulla, K., Akbar, A.: A study on software vulnerability prediction model. In: 2017 WiSPNET, pp. 703–706. IEEE (2017)

    Google Scholar 

  13. Suzin, J.C., Zeferino, C.A., Leithardt, V.R.Q.: Digital statelessness. In: de Paz Santana, J.F., de la Iglesia, D.H., López Rivero, A.J. (eds.) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence, pp. 178–189. Springer International Publishing, Cham (2022). https://doi.org/10.1007/978-3-030-87687-6

    Chapter  Google Scholar 

  14. Theisen, C., Herzig, K., Morrison, P., Murphy, B., Williams, L.: Approximating attack surfaces with stack traces. In: 2015 IEEE/ACM 37th ICSE, vol. 2, pp. 199–208. IEEE (2015)

    Google Scholar 

  15. Walden, J., Stuckman, J., Scandariato, R.: Predicting vulnerable components: software metrics vs text mining. In: 2014 IEEE 25th International Symposium on Software Reliability Engineering, pp. 23–33. IEEE (2014)

    Google Scholar 

  16. Yu, Z., Kraft, N.A., Menzies, T.: Finding better active learners for faster literature reviews. Empirical Softw. Eng. 23(6), 3161–3186 (2018). https://doi.org/10.1007/s10664-017-9587-0

    Article  Google Scholar 

  17. Yu, Z., Theisen, C., Williams, L., Menzies, T.: Improving vulnerability inspection efficiency using active learning. IEEE TSE 47(11), 2401–2420 (2019)

    Google Scholar 

  18. Zhang, J., Wu, J., Chen, C., Zheng, Z., Lyu, M.R.: Cds: a cross-version software defect prediction model with data selection. IEEE Access 8, 110059–110072 (2020)

    Article  Google Scholar 

  19. Zhang, Y., Lo, D., Xia, X., Xu, B., Sun, J., Li, S.: Combining software metrics and text features for vulnerable file prediction. In: 2015 20th ICECCS, pp. 40–49. IEEE (2015)

    Google Scholar 

  20. Zimmermann, T., Nagappan, N., Williams, L.: Searching for a needle in a haystack: predicting security vulnerabilities for windows vista. In: 2010 ICST, pp. 421–428. IEEE (2010)

    Google Scholar 

Download references

Acknowledgements

This work was supported by Fundaça̋o para a Ciência e a Tecnologia, I.P. (Portuguese Foundation for Science and Technology) by the project UIDB/05064/2020 (VALORIZA-Research Centre for Endogenous Resource Valorization). Also by PES-2021-0140 UFFS.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Guilherme Dal Bianco .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Apolinário, V.A., Bianco, G.D., Duarte, D., Leithardt, V.R.Q. (2023). Exploring Feature Extraction to Vulnerability Prediction Problem. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_7

Download citation

Publish with us

Policies and ethics