
An empirical evaluation of defect prediction approaches in within-project and cross-project context


Abstract

Software defect prediction approaches are typically evaluated only in a within-project context, against only a few other approaches, and under differing scenarios and performance indicators. We therefore conduct a series of experiments that evaluate well-known defect prediction approaches using several performance indicators. The evaluations cover ranking of entities, both with and without considering the effort required to review them, and classification of entities, in within-project as well as cross-project contexts. We also evaluate the effect of imbalanced datasets on the ranking of the approaches. Our results indicate that, in both within-project and cross-project contexts, process metrics, source code churn, and source code entropy perform significantly better for classification and for ranking, with and without effort consideration. The previous-defects metric and other single-metric approaches (such as lines of code) perform worst. Imbalanced datasets do not change the ranking of the approaches. We therefore suggest using process metrics, source code churn, and source code entropy as predictors in future defect prediction studies, and exercising care when using single-metric approaches as predictors. Moreover, different evaluation scenarios produce different orderings of the approaches in within-project and cross-project contexts. We conclude that each problem context has distinct characteristics, and that conclusions from within-project studies should not be generalized to the cross-project context, or vice versa.
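To make the evaluation setup concrete, the following is a minimal sketch (not the authors' code) of one cross-project run in Python with scikit-learn: a classifier is trained on one project's process metrics and applied to another project, once for the classification scenario (AUC) and once for an effort-aware ranking scenario. The file names, column names, and choice of classifier are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical per-file datasets: one row per source file, with process
# metrics (revisions, churn, entropy), size in lines of code, and a 0/1
# defect label. Column and file names are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

train = pd.read_csv("project_a_metrics.csv")   # source project
test = pd.read_csv("project_b_metrics.csv")    # target project

features = ["revisions", "churn", "entropy"]   # assumed metric columns
clf = LogisticRegression(max_iter=1000)
clf.fit(train[features], train["defective"])

# Classification scenario: area under the ROC curve on the target project.
probs = clf.predict_proba(test[features])[:, 1]
print("cross-project AUC:", roc_auc_score(test["defective"], probs))

# Effort-aware ranking scenario: order files by predicted defect density
# (predicted probability divided by size) so that reviewing the top of the
# list yields many defects per line of code inspected.
test = test.assign(score=probs / test["loc"].clip(lower=1))
ranked = test.sort_values("score", ascending=False)
top20 = ranked.head(int(0.2 * len(ranked)))
recall_at_20 = top20["defective"].sum() / test["defective"].sum()
print("defects found in top 20% of effort-ranked files:", recall_at_20)
```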




Author information


Corresponding author

Correspondence to Nayeem Ahmad Bhat.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bhat, N.A., Farooq, S.U. An empirical evaluation of defect prediction approaches in within-project and cross-project context. Software Qual J 31, 917–946 (2023). https://doi.org/10.1007/s11219-023-09615-7

