A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes

Baradaran, Sara; Heidari, Mahdi; Kamali, Ali; Mouzarani, Maryam

doi:10.1007/s10207-023-00691-1

Regular Contribution
Published: 07 May 2023

Volume 22, pages 1277–1290 (2023)

International Journal of Information Security

Abstract

Memory corruption is a serious class of software vulnerabilities that must be detected and removed from applications before it is exploited to harm system users. Symbolic execution is a well-known method for analyzing programs and detecting various vulnerabilities, including memory corruption. Although this method is sound and complete in theory, it faces challenges such as path explosion when applied to complex real-world programs. In this paper, we present a method for improving the efficiency of symbolic execution and detecting four classes of memory corruption vulnerabilities in executable codes: heap-based buffer overflow, stack-based buffer overflow, use-after-free, and double-free. We perform symbolic execution only on test units rather than on the whole program to lower the chance of path explosion. In our method, test units are parts of the program’s code that might contain vulnerable statements; they are identified statically based on the specifications of memory corruption vulnerabilities. Each test unit is then symbolically executed to calculate path and vulnerability constraints for each of its statements. These constraints determine the conditions on the unit's input data for executing the statement or for activating a vulnerability in it, respectively. Solving these constraints yields input values for the test unit that execute the desired statements and reveal vulnerabilities in them. Finally, we use machine learning to approximate the correlation between system and unit input data. In this way, we generate system inputs that enter the program, reach vulnerable instructions in the desired test unit, and reveal vulnerabilities in them. The method is implemented as a plug-in for the angr framework and evaluated on a group of benchmark programs. The experiments show its superiority over similar tools in accuracy and performance.
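As a rough illustration of the static unit-selection step, the sketch below marks a function as a test unit when it directly calls a routine from a watch list. The call graph, the function names, and the watch list are all hypothetical. The paper derives its criteria from the vulnerability specifications and works on binaries through angr, which this toy does not attempt.

```python
# Hypothetical watch list: routines whose misuse can corrupt memory.
# This is an illustrative assumption, not the paper's actual criteria.
SENSITIVE_CALLS = {
    "buffer_overflow": {"memcpy", "strcpy"},
    "uaf_or_double_free": {"free"},
}


def find_test_units(call_graph):
    """Map each function that calls a sensitive routine to the classes to check."""
    units = {}
    for func, callees in call_graph.items():
        classes = sorted(
            cls for cls, names in SENSITIVE_CALLS.items() if names & set(callees)
        )
        if classes:
            units[func] = classes
    return units


# Toy call graph (function -> direct callees); all names are illustrative.
call_graph = {
    "main": ["scanf", "signup", "check"],
    "signup": ["malloc", "memcpy"],
    "check": ["strcpy"],
    "cleanup": ["free"],
    "banner": ["puts"],
}

units = find_test_units(call_graph)
```

Only the selected functions would then be handed to symbolic execution, which is what keeps the explored path space small compared with executing the whole program.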

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
Sara Baradaran, Mahdi Heidari, Ali Kamali & Maryam Mouzarani

Corresponding author

Correspondence to Maryam Mouzarani.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Data availability

The test cases used in this research are publicly available online and the links to them have been provided in the article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The complex test programs share a base structure: a simple authentication program through which users sign up and sign in. Their source code is presented in Fig. 13. To generate the test program for a given vulnerability class, the lines commented out for that vulnerability should be uncommented. As an example, the structure of the test program containing the heap-based buffer overflow vulnerability is explained below.

This program begins by reading a username and password from the console to sign up a user. If the condition in line 94 is satisfied, the vulnerable function signup is called. In this function, two heap buffers are allocated in lines 6 and 7. Because the function contains two copy operations, the memcpy calls in lines 13 and 16, our solution identifies it as a test unit. The function also contains a path constraint in line 10. Therefore, if the input strings for username and password satisfy the path constraints in lines 94 and 10, and they are longer than the destination heap buffers of the copy operations, they cause a heap-based buffer overflow. Note that the path constraint in line 94 lies outside the test unit and must be estimated through machine learning. UbSym calculates the path constraint in line 10 using symbolic execution. It then generates input data for the scanf operations in line 91 that is consistent with the path constraints both inside and outside the test unit.
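The interplay of a path constraint and a vulnerability constraint inside a unit can be sketched as follows. This is a minimal model under stated assumptions: the buffer size and the branch condition are hypothetical stand-ins, since Fig. 13 is not reproduced here, and a brute-force search over input lengths plays the role that an SMT solver would play in a symbolic execution engine.

```python
DEST_SIZE = 16  # hypothetical size of the destination heap buffer


def path_constraint(username: str) -> bool:
    # Hypothetical stand-in for the branch that guards the copy inside the unit.
    return len(username) >= 4


def vulnerability_constraint(username: str) -> bool:
    # The copy overflows when more bytes are written than the buffer can hold.
    return len(username.encode()) > DEST_SIZE


def find_triggering_input(max_len: int = 64):
    """Return the shortest input that satisfies both constraints, or None."""
    for n in range(max_len + 1):
        candidate = "A" * n
        if path_constraint(candidate) and vulnerability_constraint(candidate):
            return candidate
    return None


witness = find_triggering_input()
```

An input must satisfy both predicates at once: the path constraint makes the copy statement reachable, and the vulnerability constraint makes the copy write past the end of the buffer.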

There are two other test units in this program, the check and authentication functions, which cause heap-based buffer overflow by calling strcpy and memcpy, respectively. These functions pose the same challenge for our solution: calculating the path constraints inside the test unit and estimating the ones outside it. UbSym successfully identified these units and generated test data for the whole program that reaches the vulnerable instructions in each unit and triggers a heap-based buffer overflow in them.
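One way to picture the estimation of the relation between system inputs and unit inputs is a simple curve fit. The sketch below is an assumption chosen for illustration, not the model the paper uses: it fits a least-squares line to hypothetical observations of a system input length and the length seen at the unit's entry, then inverts the line to pick a system input for a desired unit input.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations gathered by running the program on sample inputs:
# length typed at the console -> length of the buffer passed to the unit.
system_lengths = [2, 4, 6, 8, 10]
unit_lengths = [3, 5, 7, 9, 11]

slope, intercept = fit_line(system_lengths, unit_lengths)


def system_length_for(unit_length):
    # Invert the fitted line to find the system input that yields the
    # unit input demanded by the solved constraints.
    return round((unit_length - intercept) / slope)


needed = system_length_for(17)
```

With a fitted relation of this kind, a unit-level input found by constraint solving can be translated back into a system-level input without symbolically executing the code between the program entry and the unit.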

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Baradaran, S., Heidari, M., Kamali, A. et al. A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes. Int. J. Inf. Secur. 22, 1277–1290 (2023). https://doi.org/10.1007/s10207-023-00691-1

Published: 07 May 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s10207-023-00691-1
