Abstract
Software crashes occur when a program executes incorrectly or is forcibly interrupted, which degrades the user experience. Because stack traces carry exception-related information about crashes, researchers have used features extracted from stack traces to automatically predict whether the crashing fault resides in the stack trace, with the aim of accelerating crash localization. A recent work conducted the first large-scale empirical study of how feature selection methods affect the performance of classification models for this task. However, crash data are intrinsically class-imbalanced, i.e., the number of crash instances whose fault lies inside the stack trace differs greatly from the number whose fault lies outside it, and the previous work ignored this characteristic. To fill this gap, we conduct a large-scale empirical study of how different imbalanced learning techniques affect the performance of crashing fault residence prediction models, using a benchmark dataset comprising seven software projects and four evaluation indicators. Our experimental results show that two imbalanced variants of the bagging classifier outperform the other compared techniques in both the normal and cross-project settings, and consistently achieve excellent prediction performance even as the imbalance level changes.
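To make the idea of an imbalanced variant of bagging concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the two variants, so the sketch assumes one common design: each bootstrap sample under-samples the majority class so that every base learner is trained on balanced data. The nearest-centroid base learner, the function names, and the toy data are all hypothetical and chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def balanced_bootstrap(X, y, rng):
    """Draw a bootstrap sample with the same number of instances per class.

    Assumption: balance is reached by sampling (with replacement) as many
    instances from every class as the smallest class contains.
    """
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    sample_X, sample_y = [], []
    for label, rows in by_class.items():
        for _ in range(n_min):
            sample_X.append(rng.choice(rows))
            sample_y.append(label)
    return sample_X, sample_y


def fit_centroids(X, y):
    """Hypothetical toy base learner: store one mean vector per class."""
    counts = Counter(y)
    sums = {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def fit_balanced_bagging(X, y, n_estimators=11, seed=0):
    """Train one base learner per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [fit_centroids(*balanced_bootstrap(X, y, rng)) for _ in range(n_estimators)]


def predict_bagging(models, features):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroid(m, features) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical imbalanced toy data: 20 majority instances (label 0)
# and 3 minority instances (label 1).
X = [(i * 0.1, 0.0) for i in range(20)] + [(5.0, 5.0), (5.5, 4.5), (4.5, 5.5)]
y = [0] * 20 + [1] * 3

models = fit_balanced_bagging(X, y)
minority_prediction = predict_bagging(models, (5.0, 5.0))
majority_prediction = predict_bagging(models, (0.5, 0.0))
```

Because every bootstrap sample holds equal numbers of both classes, no base learner can reach low error simply by always predicting the majority class. This is the general motivation behind imbalance-aware ensembles, although the variants evaluated in the study may balance the samples differently.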
Data Availability
The datasets generated during and/or analysed during the current study are available in the GitHub repository, https://github.com/sepine/EMSE-2022.
Acknowledgements
This work was supported in part by the National Key Research and Development Project (No. 2021YFB1714200), the Open Foundation of Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education of China (Grant No. CPSDSC202004), the National Natural Science Foundation of China (Nos. 62002034, 62002306, and 62272377), the Fundamental Research Funds for the Central Universities (Nos. 2022CDJDX-005, xxj022019001, and xzy012020009), the Natural Science Foundation of Chongqing (No. cstc2021jcyj-msxmX0538), the Macao Science and Technology Development Fund (Grant Nos. 0047/2020/A1 and 0014/2022/A), the CCF-NOFOCUS kunpeng Fund, and the Young Talent Fund of Association for Science and Technology in Shaanxi, China.
Ethics declarations
Conflict of Interests
The authors have no conflict of interest.
Additional information
Communicated by: Mehdi Mirakhorli
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhao, K., Xu, Z., Yan, M. et al. The impact of class imbalance techniques on crashing fault residence prediction models. Empir Software Eng 28, 49 (2023). https://doi.org/10.1007/s10664-023-10294-y
DOI: https://doi.org/10.1007/s10664-023-10294-y