bjXnet: an improved bug localization model based on code property graph and attention mechanism

Automated Software Engineering

Abstract

Bug localization techniques and tools are widely used in software engineering. Although state-of-the-art methods have made great progress, they consider source code only at the text level, which can establish spurious correlations between the source code and the bug report and thereby reduce localization accuracy and reliability. In this paper, we propose an improved bug localization model that supplements the text-level semantics of source code with its graph-level semantics. The graph semantics are further adjusted with an attention mechanism, yielding a code semantic feature that captures both the shallow and deep semantics of the source code. The correlation between the code semantic feature and the report semantic feature is then measured by cosine similarity. We conduct experiments on three open-source Java projects to comprehensively evaluate the proposed model. The experimental results show that the model significantly outperforms state-of-the-art methods.
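The final matching step described in the abstract, scoring each source file against a bug report by cosine similarity, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the file names, and the three-dimensional toy vectors are hypothetical, and the feature vectors are assumed to have been produced already by the code and report encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_files(report_vec, code_vecs):
    """Return (file name, score) pairs sorted from most to least similar."""
    scores = [
        (name, cosine_similarity(report_vec, vec))
        for name, vec in code_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy features; real features would come from trained encoders.
report = [1.0, 0.0, 1.0]
files = {
    "A.java": [1.0, 0.0, 1.0],
    "B.java": [0.0, 1.0, 0.0],
    "C.java": [1.0, 1.0, 0.0],
}
ranking = rank_files(report, files)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on the angle between the two vectors, the ranking is unaffected by differences in feature magnitude between the code and report encoders.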

Notes

  1. https://nlp.stanford.edu/projects/glove/.

  2. https://github.com/joernio/joern.

  3. https://www.dgl.ai.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2021YFB3100500) and the Frontier Science and Technology Innovation Projects of the National Key Research and Development Program (No. 2019QY1405).

Author information

Contributions

JH and CH conceived the proposed model. JH and SS implemented the proposed model. ZL designed the experiments and comprehensively evaluated the proposed model on the experimental dataset. JH and CH wrote the manuscript text, and all authors reviewed it.

Corresponding author

Correspondence to Cheng Huang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Han, J., Huang, C., Sun, S. et al. bjXnet: an improved bug localization model based on code property graph and attention mechanism. Autom Softw Eng 30, 12 (2023). https://doi.org/10.1007/s10515-023-00379-9

  • DOI: https://doi.org/10.1007/s10515-023-00379-9
