Machine learning steered symbolic execution framework for complex software code

  • Original Article
Formal Aspects of Computing

Abstract

While traversing a program, symbolic execution collects path conditions and feeds them to a constraint solver to obtain feasible solutions. However, complex path conditions such as nonlinear constraints, which appear widely in programs, are hard for existing solvers to handle efficiently. In this paper, we adapt the classical symbolic execution framework with a machine learning approach to constraint satisfaction. The approach samples candidate solutions and learns from them to identify potentially feasible areas. This sampling-and-learning style of solving can be applied easily to different classes of complex problems. By incorporating this approach, our framework, MLBSE, therefore supports the symbolic execution not only of simple linear path conditions but also of nonlinear arithmetic operations and even black-box calls to library methods. Moreover, thanks to the theoretical foundation of the machine learning based approach, when the solver fails to solve a path condition, we can provide an estimation of the confidence in the satisfiability (ECS) of the problem, which gives users insight into how the problem was analyzed and whether a solution could ultimately be found. We implement MLBSE on top of Symbolic PathFinder (SPF) as a fully automatic Java symbolic execution engine, so users can feed their code to MLBSE directly. To evaluate its performance, we use 22 real-world programs as benchmarks for test case generation with MLBSE; they involve a total of 1042 methods that are full of nonlinear operations, floating-point arithmetic, and native method calls. Experimental results show that the coverage achieved by MLBSE is much higher than that of state-of-the-art tools.
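
To make the sample-then-learn idea in the abstract concrete, the following is a minimal sketch and not MLBSE's actual solver. It assumes a hypothetical nonlinear path condition (`x*x + y*y <= 25` and `x*y >= 4`), a hand-written violation measure that is zero exactly when the condition holds, and a deliberately crude "learner" that takes the bounding box of the best samples as the potentially feasible area. The names `violation` and `solve` and all parameters are illustrative assumptions.

```python
import random


def violation(point):
    """How far `point` is from satisfying the hypothetical path condition
    x*x + y*y <= 25 and x*y >= 4. Zero means the condition holds."""
    x, y = point
    return max(0.0, x * x + y * y - 25.0) + max(0.0, 4.0 - x * y)


def solve(measure, bounds, budget=2000, keep=10, explore=0.2, seed=0):
    """Search `bounds` for a point whose measure is zero.

    Returns (solution, 0.0) on success, or (None, best measure seen)
    when the sampling budget runs out."""
    rng = random.Random(seed)

    def uniform(box):
        return tuple(rng.uniform(lo, hi) for lo, hi in box)

    # Initial sampling: score a handful of uniformly drawn points.
    scored = sorted((measure(p), p) for p in
                    (uniform(bounds) for _ in range(2 * keep)))

    for _ in range(budget):
        if scored[0][0] == 0.0:
            return scored[0][1], 0.0
        # Learning step (crude stand-in for a classifier): treat the
        # `keep` best samples as positives and take their bounding box
        # as the potentially feasible area.
        positives = [p for _, p in scored[:keep]]
        box = [(min(axis), max(axis)) for axis in zip(*positives)]
        # Sampling step: usually draw from the learned box, sometimes
        # from the whole domain so the search cannot get stuck.
        region = bounds if rng.random() < explore else box
        candidate = uniform(region)
        scored.append((measure(candidate), candidate))
        scored.sort()
        del scored[2 * keep:]

    if scored[0][0] == 0.0:
        return scored[0][1], 0.0
    return None, scored[0][0]


if __name__ == "__main__":
    domain = [(-10.0, 10.0), (-10.0, 10.0)]
    print(solve(violation, domain))
```

Because the loop only ever calls `measure` on concrete points, the same search could in principle treat a black-box library call as just another term in the measure, which is the property the abstract highlights. The sketch does not attempt to model the ECS estimate.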

Acknowledgements

The authors thank the anonymous reviewers and editors for their valuable advice on improving this paper. The authors would also like to thank Mr. Xin Li, Mr. Yuchao Duan, and Mr. Bochuan Chen for their efforts in developing MLBSE. This work is supported in part by the National Key Research and Development Program of China (2020AAA0107200), the National Natural Science Foundation of China (Nos. 61632015, 61690204, 61876077), and the Leading-edge Technology Program of Jiangsu Natural Science Foundation (No. BK20202001).

Author information

Corresponding author

Correspondence to Lei Bu.

Additional information

Zhiming Liu, Xiaoping Chen, Ji Wang and Jim Woodcock

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bu, L., Liang, Y., Xie, Z. et al. Machine learning steered symbolic execution framework for complex software code. Form Asp Comp 33, 301–323 (2021). https://doi.org/10.1007/s00165-021-00538-3

  • DOI: https://doi.org/10.1007/s00165-021-00538-3