Abstract
Context:
Just-in-time defect prediction (JITDP) leverages modern machine learning models to predict the defect-proneness of commits. Such models require adequate training data, which is unavailable in projects with short histories. To address this problem, cross-project methods reuse data or models from other projects, on the assumption that those projects share similar defect-related features. However, such features are often overlooked, which leads to unsatisfactory model performance.
Objective:
This study aims to investigate the relationship between cross-project JITDP performance and project features, thereby improving the performance of cross-project models.
Method:
We propose a Feature-based ENSEmble modeling approach (FENSE) to cross-project JITDP. For a target project, FENSE pairs it with each source project and extracts 20 features of the pair. Leveraging these features, it predicts the transferability of each off-the-shelf JITDP model, then identifies the most transferable models and combines them to make cross-project predictions. To build and validate this approach, we conduct a large-scale empirical study of 113,906 project pairs on GitHub and investigate the impact of project features.
Results:
The results show that: (1) cross-project transferability is highly related to features including the programming language and the defect ratio of the source project; (2) our feature-based model selection scheme can improve cross-project JITDP performance by 10%; (3) FENSE outperforms other models on five evaluation measures without incurring extra time or space costs.
Conclusions:
Our study suggests that project features can help identify powerful cross-project JITDP models and improve the performance of ensemble approaches.
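The pipeline described in the Method section can be sketched in a few steps: learn a transferability predictor from pair-level features, rank candidate source models for a new target, and ensemble the top-ranked models. The sketch below is a conceptual illustration only; the regressor/classifier choices, feature counts, and all variable names are assumptions for illustration, not the paper's exact implementation.

```python
# Conceptual sketch of FENSE-style feature-based ensemble selection.
# All names and model choices are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: 5 source projects, each with an off-the-shelf JITDP model.
n_sources, n_pair_features, n_commit_features = 5, 20, 14
source_models = []
for _ in range(n_sources):
    X = rng.normal(size=(200, n_commit_features))
    y = (X[:, 0] + rng.normal(size=200)) > 0.5  # synthetic defect labels
    source_models.append(
        RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y))

# Step 1: train a transferability predictor mapping 20 pair-level features
# to the cross-project performance observed on historical project pairs.
pair_X = rng.normal(size=(100, n_pair_features))
pair_perf = rng.uniform(0.5, 0.9, size=100)  # e.g. observed AUC per pair
transfer_model = RandomForestRegressor(
    n_estimators=10, random_state=0).fit(pair_X, pair_perf)

# Step 2: for a new target project, score each (target, source) pairing
# and keep the k most transferable source models.
target_pair_features = rng.normal(size=(n_sources, n_pair_features))
scores = transfer_model.predict(target_pair_features)
k = 3
top_k = np.argsort(scores)[::-1][:k]

# Step 3: ensemble the selected models by averaging their predicted
# defect probabilities for the target project's commits.
commits = rng.normal(size=(10, n_commit_features))
probs = np.mean(
    [source_models[i].predict_proba(commits)[:, 1] for i in top_k], axis=0)
print(probs.shape)  # (10,)
```

Because only the top-k models are queried at prediction time, an ensemble of this shape adds no training cost beyond the one-off transferability model, which matches the "no extra time and space costs" claim in the Results.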
Acknowledgments
This work is supported by the National Key R&D Program of China (2020AAA0103504), the National Natural Science Foundation of China (No. 61872373), and the Major Key Project of PCL.
Additional information
Communicated by: Markus Borg
Cite this article
Zhang, T., Yu, Y., Mao, X. et al. FENSE: A feature-based ensemble modeling approach to cross-project just-in-time defect prediction. Empir Software Eng 27, 162 (2022). https://doi.org/10.1007/s10664-022-10185-8