
Can traditional fault prediction models be used for vulnerability prediction?

Published in: Empirical Software Engineering

Abstract

Finding security vulnerabilities requires a different mindset than finding general faults in software: thinking like an attacker. Therefore, security engineers looking to prioritize security inspection and testing efforts may be better served by a prediction model that indicates security vulnerabilities rather than faults. At the same time, faults and vulnerabilities have commonalities that may allow development teams to use traditional fault prediction models and metrics for vulnerability prediction. The goal of our study is to determine whether fault prediction models can be used for vulnerability prediction or whether specialized vulnerability prediction models should be developed, when both models are built with traditional metrics of complexity, code churn, and fault history. We performed an empirical study on a widely used, large open source project, the Mozilla Firefox web browser, in which 21% of the source code files have faults and only 3% of the files have vulnerabilities. Both the fault prediction model and the vulnerability prediction model provide similar vulnerability prediction capability across a wide range of classification thresholds. For example, the fault prediction model provided recall of 83% and precision of 11% at classification threshold 0.6, and the vulnerability prediction model provided recall of 83% and precision of 12% at classification threshold 0.5. Our results suggest that fault prediction models based on traditional metrics can substitute for specialized vulnerability prediction models. However, both fault prediction and vulnerability prediction models require significant improvement to reduce false positives while providing high recall.
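The evaluation summarized above rests on training a file-level classifier from complexity, code churn, and fault-history metrics and then sweeping the classification threshold to trade recall against precision. The sketch below illustrates that workflow in Python with scikit-learn; the paper's own models were built with Weka, and the input file, column names, and choice of logistic regression here are illustrative assumptions rather than the authors' actual setup.

```python
# Minimal sketch: threshold sweep for a file-level prediction model.
# Assumes a CSV of per-file metrics with hypothetical column names; the
# paper's pipeline used Weka, not scikit-learn.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

files = pd.read_csv("firefox_file_metrics.csv")            # hypothetical input
metrics = ["cyclomatic_complexity", "loc", "code_churn", "prior_faults"]
X, y = files[metrics], files["is_vulnerable"]               # or "is_faulty"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Predicted probability of the positive (vulnerable/faulty) class per file.
prob = model.predict_proba(X_te)[:, 1]

# Sweep the classification threshold, as in the recall/precision comparison
# reported at thresholds 0.5 and 0.6 in the abstract.
for threshold in [0.3, 0.4, 0.5, 0.6, 0.7]:
    pred = (prob >= threshold).astype(int)
    print(f"threshold={threshold:.1f}  "
          f"recall={recall_score(y_te, pred, zero_division=0):.2f}  "
          f"precision={precision_score(y_te, pred, zero_division=0):.2f}")
```

Sweeping the threshold in this way is what allows the fault model at threshold 0.6 and the vulnerability model at threshold 0.5 to be compared at matched recall, as in the abstract.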




Acknowledgment

This work was supported in part by the National Science Foundation Grant No. 0716176 and the CAREER Grant No. 0346903. Any opinions expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. We also thank the NCSU Software Engineering Realsearch group for their reviews of the initial version of this paper. We thank Dr. Robert Bell and Dr. Raffaella Settimi for their advice on statistics. Most of all, we thank the editors and reviewers of the Empirical Software Engineering journal for their thorough reviews and helpful suggestions.

Author information

Correspondence to Yonghee Shin.

Additional information

Editor: Tim Menzies

Appendix A Comparison of Metrics Values for Neutral, Faulty, and Vulnerable Files

Table 11 Descriptive statistics for metrics of neutral, faulty, and vulnerable files for Firefox 2.0
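Only the caption of Table 11 survives here, so the following is a minimal sketch of how such per-group descriptive statistics could be derived from a per-file metrics dataset. The grouping rule (vulnerable takes precedence over faulty) and all column names are assumptions for illustration, not the authors' published procedure.

```python
# Sketch: descriptive statistics for neutral vs. faulty vs. vulnerable files.
# Column names and the grouping rule are assumptions; the published Table 11
# is computed from the Firefox 2.0 dataset described in the paper.
import pandas as pd

files = pd.read_csv("firefox_2.0_file_metrics.csv")        # hypothetical input

def classify(row):
    # A file with any vulnerability counts as vulnerable, even if it also has
    # non-security faults; otherwise it is faulty or neutral.
    if row["is_vulnerable"]:
        return "vulnerable"
    if row["is_faulty"]:
        return "faulty"
    return "neutral"

files["group"] = files.apply(classify, axis=1)

metrics = ["cyclomatic_complexity", "loc", "code_churn", "prior_faults"]
summary = files.groupby("group")[metrics].describe().round(2)
print(summary)
```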


Cite this article

Shin, Y., Williams, L. Can traditional fault prediction models be used for vulnerability prediction? Empir Software Eng 18, 25–59 (2013). https://doi.org/10.1007/s10664-011-9190-8

