A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes

  • Regular Contribution
International Journal of Information Security

Abstract

Memory corruption is a serious class of software vulnerabilities that must be detected and removed from applications before it is exploited to harm system users. Symbolic execution is a well-known method for analyzing programs and detecting various vulnerabilities, including memory corruption. Although this method is sound and complete in theory, it faces challenges such as path explosion when applied to complex real-world programs. In this paper, we present a method that improves the efficiency of symbolic execution and detects four classes of memory corruption vulnerabilities in executable codes: heap-based buffer overflow, stack-based buffer overflow, use-after-free, and double-free. To lower the chance of path explosion, we perform symbolic execution only on test units rather than on the whole program. In our method, test units are parts of the program’s code that might contain vulnerable statements; they are identified statically based on the specifications of memory corruption vulnerabilities. Each test unit is then symbolically executed to calculate path constraints and vulnerability constraints for each statement of the unit. These constraints determine the conditions on the unit input data for executing that statement and for activating a vulnerability in it, respectively. Solving these constraints yields input values for the test unit that execute the desired statements and reveal vulnerabilities in them. Finally, we use machine learning to approximate the correlation between system input data and unit input data. We thereby generate system inputs that enter the program, reach vulnerable instructions in the desired test unit, and reveal vulnerabilities in them. The method is implemented as a plug-in for the angr framework and evaluated on a group of benchmark programs. The experiments show that it outperforms similar tools in accuracy and performance.
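
The final step, relating system inputs to unit inputs, can be illustrated with a small sketch. The abstract does not name the learning model, so the code below assumes, purely for illustration, that a unit input (for example, the length of a buffer argument arriving at the test unit) depends linearly on a system input (for example, the length of a string typed at the console) and fits that relation by ordinary least squares. The function names and the sample observations are hypothetical and are not taken from the paper.

```python
def fit_line(samples):
    """Least-squares fit of unit_value ~ a * system_value + b."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def system_input_for(target_unit_value, a, b):
    """Invert the fitted line to estimate the system input that yields the target."""
    return (target_unit_value - b) / a


# Hypothetical observations: (system input length, length seen at the unit entry).
observed = [(4, 5), (8, 9), (16, 17), (32, 33)]
a, b = fit_line(observed)

# Assume constraint solving reported that the unit needs a length of 65.
needed = system_input_for(65, a, b)
```

A fitted relation of this kind only approximates the code that sits between the program entry point and the test unit, so a generated system input would still need to be confirmed by running the program.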

References

  1. Arlinghaus, S.L., Arlinghaus, W.C., Drake, W.D., Nystuen, J.D.: Practical handbook of curve fitting (1994)

  2. Baldoni, R., Coppa, E., D’Elia, D.C., Demetrescu, C., Finocchi, I.: A survey of symbolic execution techniques. ACM Comput. Surv. 51(3), 1–39 (2018). https://doi.org/10.1145/3182657

  3. Caballero, J., Grieco, G., Marron, M., Nappa, A.: Undangle: early detection of dangling pointers in use-after-free and double-free vulnerabilities. In: Proceedings of the 2012 International Symposium on Software Testing and Analysis, ISSTA 2012, pp. 133–143. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2338965.2336769

  4. Cadar, C., Dunbar, D., Engler, D.: KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, OSDI’08, pp. 209–224. USENIX Association, USA (2008). https://doi.org/10.5555/1855741.1855756

  5. Cha, S., Lee, S., Oh, H.: Template-guided concolic testing via online learning. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, pp. 408–418. Association for Computing Machinery, New York (2018). https://doi.org/10.1145/3238147.3238227

  6. Cha, S., Oh, H.: Concolic testing with adaptively changing search heuristics. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2019, pp. 235–245. Association for Computing Machinery, New York (2019). https://doi.org/10.1145/3338906.3338964

  7. Cha, S., Hong, S., Bak, J., Kim, J., Lee, J., Oh, H.: Enhancing dynamic symbolic execution by automatically learning search heuristics. IEEE Trans. Softw. Eng. 48(9), 3640–3663 (2022). https://doi.org/10.1109/TSE.2021.3101870

  8. Chen, J., Hu, W., Zhang, L., Hao, D., Khurshid, S., Zhang, L.: Learning to accelerate symbolic execution via code transformation. In: Millstein, T. (ed.) 32nd European Conference on Object-Oriented Programming (ECOOP 2018), Leibniz International Proceedings in Informatics (LIPIcs), vol. 109, pp. 6:1–6:27. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl (2018). https://doi.org/10.4230/LIPIcs.ECOOP.2018.6

  9. Chen, T., Zhang, X.S., Guo, S.Z., Li, H.Y., Wu, Y.: State of the art: dynamic symbolic execution for automated test generation. Future Gener. Comput. Syst. 29(7), 1758–1773 (2013). https://doi.org/10.1016/j.future.2012.02.006

  10. Davies, M., Păsăreanu, C.S., Raman, V.: Symbolic execution enhanced system testing. In: Proceedings of the 4th International Conference on Verified Software: Theories, Tools, Experiments, VSTTE’12, pp. 294–309. Springer, Berlin (2012). https://doi.org/10.1007/978-3-642-27705-4_23

  11. Gao, Y., Chen, L., Shi, G., Zhang, F.: A comprehensive detection of memory corruption vulnerabilities for C/C++ programs. In: 2018 IEEE International Conference on Parallel and Distributed Processing with Applications, Ubiquitous Computing and Communications, Big Data and Cloud Computing, Social Computing and Networking, Sustainable Computing and Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp. 354–360 (2018). https://doi.org/10.1109/BDCloud.2018.00062

  12. Jia, X., Zhang, C., Su, P., Yang, Y., Huang, H., Feng, D.: Towards efficient heap overflow discovery. In: USENIX Security Symposium (2017)

  13. Menzies, T., Hu, Y.: Data mining for very busy people. Computer 36(11), 22–29 (2003). https://doi.org/10.1109/MC.2003.1244531

  14. Mouzarani, M., Sadeghiyan, B.: Towards designing an extendable vulnerability detection method for executable codes. Inf. Softw. Technol. 80, 231–244 (2016). https://doi.org/10.1016/j.infsof.2016.09.004

  15. National Institute of Standards and Technology: Software Assurance Reference Dataset project. https://samate.nist.gov/SRD. Last accessed 4 March 2022

  16. Ognawala, S., Ochoa, M., Pretschner, A., Limmer, T.: Macke: Compositional analysis of low-level vulnerabilities with symbolic execution. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, ASE 2016, pp. 780–785. Association for Computing Machinery, New York (2016). https://doi.org/10.1145/2970276.2970281

  17. Park, J., Choi, B., Kim, Y.: Automated memory corruption detection through analysis of static variables and dynamic memory usage. Electronics 10(17), 2127 (2021). https://doi.org/10.3390/electronics10172127

  18. Păsăreanu, C.S., Mehlitz, P.C., Bushnell, D.H., Gundy-Burlet, K., Lowry, M., Person, S., Pape, M.: Combining unit-level symbolic execution and system-level concrete execution for testing NASA software. In: Proceedings of the 2008 International Symposium on Software Testing and Analysis, ISSTA ’08, pp. 15–26. Association for Computing Machinery, New York (2008). https://doi.org/10.1145/1390630.1390635

  19. Shoshitaishvili, Y., Wang, R., Salls, C., Stephens, N., Polino, M., Dutcher, A., Grosen, J., Feng, S., Hauser, C., Kruegel, C., Vigna, G.: SoK: (state of) the art of war: offensive techniques in binary analysis. In: 2016 IEEE Symposium on Security and Privacy (SP), pp. 138–157 (2016). https://doi.org/10.1109/SP.2016.17

  20. Stephens, N., Grosen, J., Salls, C., Dutcher, A., Corbetta, J., Shoshitaishvili, Y., Kruegel, C., Vigna, G.: Driller: augmenting fuzzing through selective symbolic execution. In: Network and Distributed System Security Symposium (NDSS) (2016). https://doi.org/10.14722/ndss.2016.23368

  21. Strang, G.: Linear algebra and its applications (2006)

  22. UbSym: A unit-based symbolic execution method. https://github.com/SoftwareSecurityLab/UbSym

  23. van der Kouwe, E., Nigade, V., Giuffrida, C.: Dangsan: Scalable use-after-free detection. In: Proceedings of the Twelfth European Conference on Computer Systems, EuroSys ’17, pp. 405–419. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3064176.3064211

  24. Zhang, B., Wu, B., Feng, C., Tang, C.: Memory corruption vulnerabilities detection for Android binary software. In: 2015 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), pp. 1–5 (2015). https://doi.org/10.1109/ICSPCC.2015.7338757

Author information

Corresponding author

Correspondence to Maryam Mouzarani.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Data availability

The test cases used in this research are publicly available online and the links to them have been provided in the article.

Appendix

The four designed complex programs share a base structure: a simple authentication program through which users sign up and sign in. Their source code is presented in Fig. 13. To obtain the test program for a given vulnerability class, the commented lines for that vulnerability are uncommented. The structure of the test program that contains the heap-based buffer overflow vulnerability is explained below as an example.

Fig. 13 Source code of the four designed complex programs

The program begins by reading a username and a password from the console to sign up a user. If the condition in line 94 is satisfied, the vulnerable function signup is called. This function allocates two heap buffers in lines 6 and 7. Because it contains two copy operations, the memcpy calls in lines 13 and 16, our solution identifies it as a test unit. The function also contains a path constraint in line 10. Therefore, if the input strings for the username and password satisfy the path constraints in lines 94 and 10, and their lengths exceed the sizes of the destination heap buffers in the copy operations, they cause a heap-based buffer overflow. Note that the path constraint in line 94 lies outside the test unit and must be estimated through machine learning, whereas UbSym calculates the path constraint in line 10 using symbolic execution. UbSym then generates input data for the scanf operations in line 91 that is consistent with the path constraints both inside and outside the test unit.
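
To make the interplay of the two kinds of constraints inside a test unit concrete, the following sketch models a signup-like unit in plain Python. It is not UbSym and does not use angr: a brute-force enumeration over input lengths stands in for the constraint solver. The buffer sizes, the search bound, and the branch condition are assumptions chosen for illustration, since the actual values appear only in Fig. 13.

```python
from itertools import product

USERNAME_BUF = 16  # hypothetical size of the first heap buffer
PASSWORD_BUF = 16  # hypothetical size of the second heap buffer
MAX_LEN = 40       # search bound for this toy solver


def path_constraint(user_len, pass_len):
    # Hypothetical stand-in for the branch inside the unit:
    # both strings must be non-empty for the copies to execute.
    return user_len > 0 and pass_len > 0


def vulnerability_constraint(user_len, pass_len):
    # A memcpy overflows when the copied length exceeds the destination size.
    return user_len > USERNAME_BUF or pass_len > PASSWORD_BUF


def solve():
    """Return the first (user_len, pass_len) satisfying both constraints, or None."""
    for user_len, pass_len in product(range(MAX_LEN + 1), repeat=2):
        if path_constraint(user_len, pass_len) and vulnerability_constraint(user_len, pass_len):
            return user_len, pass_len
    return None


witness = solve()
```

The returned pair plays the role of unit input data: it reaches the copy operations and makes at least one of them write past its destination buffer. A real symbolic executor would derive both predicates from the binary instead of having them written by hand.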

The program contains two other test units, the check and authentication functions, which cause heap-based buffer overflows by calling strcpy and memcpy, respectively. They pose the same challenge for our solution: calculating the path constraints inside the test unit and estimating those outside it. UbSym successfully identified these units and generated test data for the whole program that reaches the vulnerable instructions in each unit and triggers a heap-based buffer overflow in them.
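
The idea of identifying test units by the copy routines they call can be sketched as a simple filter over a call summary. The sketch below is an assumption-laden illustration, not the tool's implementation: the call summary is written by hand rather than recovered from a binary, and only the function names signup, check, authentication, memcpy, strcpy, and scanf come from the text.

```python
RISKY_SINKS = {"memcpy", "strcpy"}  # copy routines named in the text

# Hypothetical call summary: function name -> functions it calls.
CALLS = {
    "main": ["scanf", "signup", "check", "authentication"],
    "signup": ["malloc", "malloc", "memcpy", "memcpy"],
    "check": ["malloc", "strcpy"],
    "authentication": ["malloc", "memcpy"],
}


def find_test_units(calls, sinks=RISKY_SINKS):
    """Return functions that directly call at least one risky sink, sorted by name."""
    return sorted(name for name, callees in calls.items() if sinks & set(callees))


units = find_test_units(CALLS)
```

Under these assumptions the three copying functions are selected and main is not, which mirrors the split between constraints calculated inside a unit and constraints estimated outside it.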

Cite this article

Baradaran, S., Heidari, M., Kamali, A. et al. A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes. Int. J. Inf. Secur. 22, 1277–1290 (2023). https://doi.org/10.1007/s10207-023-00691-1

  • DOI: https://doi.org/10.1007/s10207-023-00691-1
