Empirical Software Engineering, Volume 21, Issue 5, pp 2072–2106

Studying just-in-time defect prediction using cross-project models

  • Yasutaka Kamei
  • Takafumi Fukushima
  • Shane McIntosh
  • Kazuhiro Yamashita
  • Naoyasu Ubayashi
  • Ahmed E. Hassan
Abstract

Unlike traditional defect prediction models that identify defect-prone modules, Just-In-Time (JIT) defect prediction models identify defect-inducing changes. As such, JIT defect models can provide earlier feedback for developers, while design decisions are still fresh in their minds. Unfortunately, similar to traditional defect models, JIT models require a large amount of training data, which is not available when projects are in initial development phases. To address this limitation in traditional defect prediction, prior work has proposed cross-project models, i.e., models learned from other projects with sufficient history. However, cross-project models have not yet been explored in the context of JIT prediction. Therefore, in this study, we empirically evaluate the performance of JIT models in a cross-project context. Through an empirical study on 11 open source projects, we find that while JIT models rarely perform well in a cross-project context, their performance tends to improve when using approaches that: (1) select models trained using other projects that are similar to the testing project, (2) combine the data of several other projects to produce a larger pool of training data, and (3) combine the models of several other projects to produce an ensemble model. Our findings empirically confirm that JIT models learned using other projects are a viable solution for projects with limited historical data. However, JIT models tend to perform best in a cross-project context when the data used to learn them are carefully selected.
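
To make the cross-project setup concrete, the sketch below trains one JIT model per donor project and averages their predicted defect-inducing probabilities for the changes of a project that lacks sufficient history of its own (the ensemble approach, item 3 above). It is a minimal illustration in Python with scikit-learn; the change-metric column names, file names, and choice of logistic regression are assumptions made for the example, not the authors' exact implementation.

```python
# A minimal sketch of the ensemble variant of cross-project JIT defect
# prediction: one model per donor project, averaged probabilities for the
# target project's changes. Metric names, file names, and the classifier
# are illustrative assumptions, not the authors' exact setup.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Change-level metrics in the spirit of JIT defect prediction studies
# (size, diffusion, history, and experience measures); assumed column names.
CHANGE_METRICS = ["ns", "nd", "nf", "entropy", "la", "ld", "lt",
                  "fix", "ndev", "age", "nuc", "exp", "rexp", "sexp"]

def train_single_project(changes: pd.DataFrame):
    """Fit one JIT model on the change-level data of a single donor project."""
    X, y = changes[CHANGE_METRICS], changes["buggy"]
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return model.fit(X, y)

def ensemble_predict(models, target_changes: pd.DataFrame) -> np.ndarray:
    """Average the defect-inducing probability from several donor models."""
    X = target_changes[CHANGE_METRICS]
    return np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)

# Hypothetical usage: learn a model per donor project, then rank the changes
# of a project that is too young to train its own model.
#   donors = [pd.read_csv(p) for p in ["projA.csv", "projB.csv", "projC.csv"]]
#   models = [train_single_project(d) for d in donors]
#   risk = ensemble_predict(models, pd.read_csv("new_project.csv"))
```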

Keywords

Empirical study · Defect prediction · Just-in-time prediction

Acknowledgments

This research was partially supported by JSPS KAKENHI Grant Numbers 15H05306 and 24680003 and the Natural Sciences and Engineering Research Council of Canada (NSERC).

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Yasutaka Kamei 1
  • Takafumi Fukushima 1
  • Shane McIntosh 2
  • Kazuhiro Yamashita 1
  • Naoyasu Ubayashi 1
  • Ahmed E. Hassan 3

  1. Principles of Software Languages Group (POSL), Kyushu University, Fukuoka-shi, Fukuoka, Japan
  2. Department of Electrical and Computer Engineering, McGill University, Montréal, Canada
  3. Software Analysis and Intelligence Lab (SAIL), Queen’s University, Kingston, Canada