Abstract
As the number and scale of software systems continue to grow, vulnerabilities in program source code occur far more often. This harms the open-source ecosystem as a whole and makes vulnerability mining an important research direction. Traditional vulnerability mining relies on security researchers who analyze programs manually or search for vulnerabilities using predefined rules, an approach that cannot keep pace with the rapidly growing volume of software. Some existing methods apply machine learning or deep learning to automate vulnerability mining. However, these methods examine source code only at the function level or code-block level and predict only whether a given program contains a vulnerability. Their results are therefore hard to explain and act on, and they suffer from low accuracy and a high false-alarm rate. In addition, current approaches lack a well-suited representation of program source code: they treat it either as plain text or as a graph of program properties extracted by dynamic analysis, which introduces substantial data redundancy and additional computational overhead. To address these shortcomings, this paper proposes a deep learning-based vulnerability mining scheme that achieves interpretable, fine-grained vulnerability mining.
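The contrast between a flat textual view of source code and a structured view can be illustrated with a minimal sketch. It is not the chapter's pipeline, which the abstract does not describe. As an assumption made purely for illustration, it uses Python's standard `ast` and `tokenize` modules on a small hypothetical snippet (`SOURCE`), and the helper names `token_view` and `ast_view` are likewise hypothetical. The point is that syntax-tree nodes carry line numbers, so a node-level prediction could in principle be mapped back to specific lines rather than to a whole function.

```python
import ast
import io
import tokenize

# Hypothetical snippet, used only for illustration.
SOURCE = "def copy(buf, data):\n    buf[0:len(data)] = data\n"


def token_view(source):
    """Flat textual view: the lexical tokens, with no structure."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Drop whitespace-only tokens (newlines, indents, end marker).
    return [tok.string for tok in tokens if tok.string.strip()]


def ast_view(source):
    """Structured view: (node type, line number) for each located AST node."""
    tree = ast.parse(source)
    return [
        (type(node).__name__, node.lineno)
        for node in ast.walk(tree)
        if hasattr(node, "lineno")  # Module, contexts, etc. carry no location
    ]


if __name__ == "__main__":
    print(token_view(SOURCE))
    print(ast_view(SOURCE))
```

In this sketch the token list loses all nesting, while each syntax-tree entry keeps both its syntactic role (for example `Assign` or `Subscript`) and the line it came from.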
Acknowledgement
This work is partially supported by the Hainan Province Science and Technology Special Fund (ZDYF2020212).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lin, Z., Guo, Z. (2023). Program Source Code Vulnerability Mining Scheme Based on Abstract Syntax Tree. In: Hung, J.C., Yen, N.Y., Chang, JW. (eds) Frontier Computing. FC 2022. Lecture Notes in Electrical Engineering, vol 1031. Springer, Singapore. https://doi.org/10.1007/978-981-99-1428-9_268
DOI: https://doi.org/10.1007/978-981-99-1428-9_268
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1427-2
Online ISBN: 978-981-99-1428-9
eBook Packages: Engineering, Engineering (R0)