Abstract
Software defect prediction plays a key role in guiding resource allocation for software testing. However, previous defect prediction studies have two limitations: (1) the granularity of defect prediction is still coarse, so high-risk code statements cannot be accurately located; (2) in fine-grained defect prediction, a single line of code carries limited semantic and structural information, which is not sufficient to distinguish lines semantically. To address these problems, we propose LineFlowDP, a two-phase line-level defect prediction method based on deep learning. We first extract the program dependency graph (PDG) of each source file. The line of code corresponding to each PDG node is semantically extended with data flow and control flow information and embedded as a node, and the model is then trained using a relational graph convolutional network. Finally, the graph interpreter GNNExplainer and a social network analysis method are used to rank the lines of code in each defective file by risk. On 32 datasets from 9 projects, the experimental results show that LineFlowDP is 13%-404% more cost-effective than four state-of-the-art line-level defect prediction methods. Ablation experiments also verify the effectiveness of the flow information extension and the code line risk ranking methods.
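To make the final ranking step concrete, the following is a minimal Python sketch, not the LineFlowDP implementation. It assumes that the social network analysis measure is Freeman degree centrality (the abstract does not name the measure), that the explanation subgraph has already been obtained, and that it is given as a list of dependence edges between line numbers. The edge list, `degree_centrality`, and `rank_lines` are hypothetical names and data introduced for illustration only; the PDG extraction, embedding, relational graph convolutional network, and GNNExplainer steps are omitted.

```python
from collections import defaultdict

# Hypothetical toy subgraph: (source line, target line, dependence kind).
# In this sketch the dependence kind is carried along but not used for scoring.
edges = [
    (1, 2, "data"),
    (1, 3, "control"),
    (2, 3, "data"),
    (3, 4, "control"),
    (3, 5, "data"),
]

def degree_centrality(edges):
    """Degree centrality: number of distinct neighbours divided by (n - 1)."""
    neighbours = defaultdict(set)
    for src, dst, _kind in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    n = len(neighbours)
    if n < 2:
        # A graph with fewer than two nodes has no meaningful degree centrality.
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

def rank_lines(edges):
    """Order lines from highest to lowest score; ties broken by line number."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda line: (-scores[line], line))

ranking = rank_lines(edges)
print(ranking)
```

In this toy graph, line 3 is connected to every other line and is therefore ranked first. A different centrality measure would only require replacing `degree_centrality`.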
Data availability
The dataset used in this paper is available at https://github.com/awsm-research/line-level-defect-prediction. To foster replication of our study, we will publish the implementation of LineFlowDP and the baselines at https://github.com/LineFlowDP/LineFlowDP.
Acknowledgements
This work is supported by the Jiangxi Provincial Key R&D Program Project (20202BBEL53002).
Funding
This work is supported by the Jiangxi Provincial Key R&D Program Project (20202BBEL53002).
Author information
Contributions
All authors contributed equally.
Ethics declarations
Competing interests
The authors declare they have no conflicts of interest.
Additional information
Communicated by: Leandro L. Minku
Appendix
In this section, we present detailed experimental results for the four research questions. Bold and shaded values indicate the method that performs best on each dataset. Win/Tie/Loss (W/T/L) indicates the number of datasets on which LineFlowDP performs better than, the same as, or worse than the other method in terms of the metric value. “↑” indicates that higher values are better, and “↓” indicates that lower values are better.
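The W/T/L counts can be derived from two aligned lists of per-dataset metric values. The sketch below is illustrative only: the function name `win_tie_loss` and the metric values are hypothetical and are not taken from the tables, and ties are detected by exact equality, which assumes the values have already been rounded as reported.

```python
def win_tie_loss(ours, other, higher_is_better=True):
    """Count datasets where `ours` is better than, equal to, or worse than `other`."""
    if len(ours) != len(other):
        raise ValueError("both methods need one value per dataset")
    wins = ties = losses = 0
    for a, b in zip(ours, other):
        if a == b:
            ties += 1
        elif (a > b) == higher_is_better:
            # Higher value wins for an "up" metric; lower value wins for a "down" metric.
            wins += 1
        else:
            losses += 1
    return wins, ties, losses

# Hypothetical per-dataset values for two methods on four datasets.
ours = [0.70, 0.55, 0.61, 0.48]
baseline = [0.65, 0.55, 0.66, 0.40]

print(win_tie_loss(ours, baseline, higher_is_better=True))
print(win_tie_loss(ours, baseline, higher_is_better=False))
```

Passing `higher_is_better=False` handles the “↓” metrics, for which the method with the lower value counts as the winner.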
1.1 Detailed Results for RQ1
1.2 Detailed Results for RQ2
Results for RQ2.1 are shown in Tables 6, 7 and 8.
Results for RQ2.2 are shown in Tables 9, 10, 11 and 12.
1.3 Detailed Results for RQ3
Results for RQ3.1 are shown in Tables 13, 14 and 15.
Results for RQ3.2 are shown in Tables 16, 17 and 18.
1.4 Detailed Results for RQ4
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, F., Zhong, F., Zeng, G. et al. LineFlowDP: A Deep Learning-Based Two-Phase Approach for Line-Level Defect Prediction. Empir Software Eng 29, 50 (2024). https://doi.org/10.1007/s10664-023-10439-z
DOI: https://doi.org/10.1007/s10664-023-10439-z