Software defect prediction: future directions and challenges

Automated Software Engineering

Abstract

Software defect prediction is one of the most popular research topics in software engineering. Its objective is to identify defective instances before software defects occur, which helps prioritize software quality assurance efforts more effectively. In this article, we examine prospective research directions and potential challenges in the field of defect prediction. Our aim is to propose a range of defect prediction techniques and methodologies for the future, intended to enhance the practicality, explainability, and actionability of the predictions of defect models.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61902228 and 62176069), the Natural Science Basic Research Program of Shaanxi Province (Grant No. 2024JC-YBMS-497), and the China Scholarship Council.

Author information

Contributions

Zhiqiang Li: methodology, writing. Jingwen Niu: investigation, resources. Xiao-Yuan Jing: review, editing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhiqiang Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Z., Niu, J. & Jing, XY. Software defect prediction: future directions and challenges. Autom Softw Eng 31, 19 (2024). https://doi.org/10.1007/s10515-024-00424-1

  • DOI: https://doi.org/10.1007/s10515-024-00424-1
