On the use of calling structure information to improve fault prediction
Previous studies have shown that software code attributes, such as lines of source code, and history information, such as the number of code changes and the number of faults in prior releases of software, are useful for predicting where faults will occur. In this study of two large industrial software systems, we investigate the effectiveness of adding information about calling structure to fault prediction models. Adding calling structure information to a model based solely on non-calling structure code attributes modestly improved prediction accuracy. However, the addition of calling structure information to a model that included both history and non-calling structure code attributes produced no improvement.
Keywords: Software faults; Negative binomial model; Empirical study; Calling structure attributes
This work is supported in part by the National Science Foundation Grant No. 0716176 and the CAREER Grant No. 0346903. Any opinions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The comments by several reviewers helped us greatly to clarify and strengthen the results reported in the paper.