Empirical Software Engineering, Volume 17, Issue 4–5, pp 390–423

On the use of calling structure information to improve fault prediction

  • Yonghee Shin
  • Robert M Bell
  • Thomas J Ostrand
  • Elaine J Weyuker

Abstract

Previous studies have shown that software code attributes, such as lines of source code, and history information, such as the number of code changes and the number of faults in prior releases of software, are useful for predicting where faults will occur. In this study of two large industrial software systems, we investigate the effectiveness of adding information about calling structure to fault prediction models. Adding calling structure information to a model based solely on non-calling structure code attributes modestly improved prediction accuracy. However, the addition of calling structure information to a model that included both history and non-calling structure code attributes produced no improvement.

Keywords

Software faults · Negative binomial model · Empirical study · Calling structure attributes

Acknowledgment

This work was supported in part by National Science Foundation Grant No. 0716176 and CAREER Grant No. 0346903. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The comments of several reviewers helped us greatly to clarify and strengthen the results reported in this paper.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Yonghee Shin (1)
  • Robert M Bell (2)
  • Thomas J Ostrand (2)
  • Elaine J Weyuker (2)

  1. North Carolina State University, Raleigh, USA
  2. AT&T Labs Research, Florham Park, USA
