The impact of class imbalance techniques on crashing fault residence prediction models

Empirical Software Engineering

Abstract

Software crashes occur when a program executes incorrectly or is forcibly interrupted, which degrades the user experience. Because stack traces carry exception-related information about a crash, researchers have used features extracted from the stack trace to automatically predict whether the crashing fault resides in the stack trace, with the aim of accelerating crash localization. A recent work conducted the first large-scale empirical study of how feature selection methods affect the performance of classification models for this task. However, crash data are intrinsically class-imbalanced, i.e., the number of crash instances whose fault lies inside the stack trace differs greatly from the number whose fault lies outside it, and the previous work ignored this characteristic. To fill this gap, we conduct a large-scale empirical study of how different imbalanced learning techniques affect the performance of crashing fault residence prediction models, using a benchmark dataset comprising seven software projects and four evaluation indicators. Our experimental results show that two imbalanced variants of the bagging classifier outperform the other compared techniques in both the normal and cross-project settings, and consistently deliver excellent prediction performance even as the imbalance level changes.
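
To make the idea of an imbalance-aware bagging variant concrete, the following is a minimal sketch of one common approach: drawing class-balanced bootstrap bags by resampling every class down to the minority-class size. It is an illustration only and is not necessarily either of the two variants evaluated in the study. The function names, the toy labels, and the 10/90 class split are assumptions made up for this example.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_bootstrap(labels, rng):
    """Return the indices of one balanced bag.

    Each class is sampled with replacement down to the size of the
    smallest class, so every bag has equal class counts.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    bag = []
    for idxs in by_class.values():
        bag.extend(rng.choices(idxs, k=n_min))
    rng.shuffle(bag)
    return bag


# Hypothetical toy labels (proportions are made up, not taken from the paper):
# 1 = fault resides inside the stack trace, 0 = fault resides outside it.
labels = [1] * 10 + [0] * 90
rng = random.Random(0)

# One bag per base learner; a real ensemble would fit a classifier on each bag
# and combine the predictions by voting.
bags = [balanced_bootstrap(labels, rng) for _ in range(5)]
bag_counts = [Counter(labels[i] for i in bag) for bag in bags]
```

Each bag in this sketch contains the same number of instances from both classes, so a base learner trained on it would not be dominated by the majority class, while the ensemble as a whole still sees many different majority-class instances.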

Data Availability

The datasets generated and/or analysed during the current study are available in the GitHub repository, https://github.com/sepine/EMSE-2022.

Notes

  1. http://pitest.org


Acknowledgements

This work was supported in part by the National Key Research and Development Project (No. 2021YFB1714200), the Open Foundation of Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education of China (Grant No. CPSDSC202004), the National Natural Science Foundation of China (No. 62002034, 62002306, and 62272377), the Fundamental Research Funds for the Central Universities (No. 2022CDJDX-005, xxj022019001, and xzy012020009), the Natural Science Foundation of Chongqing (No. cstc2021jcyj-msxmX0538), the Macao Science and Technology Development Fund under Grant (0047/2020/A1 and 0014/2022/A), the CCF-NOFOCUS kunpeng Fund, and the Young Talent Fund of Association for Science and Technology in Shaanxi, China.

Author information

Corresponding authors

Correspondence to Zhou Xu or Meng Yan.

Ethics declarations

Conflict of Interests

The authors have no conflict of interest.

Additional information

Communicated by: Mehdi Mirakhorli

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, K., Xu, Z., Yan, M. et al. The impact of class imbalance techniques on crashing fault residence prediction models. Empir Software Eng 28, 49 (2023). https://doi.org/10.1007/s10664-023-10294-y

  • DOI: https://doi.org/10.1007/s10664-023-10294-y
