Abstract
While traversing a program, symbolic execution collects path conditions and feeds them to a constraint solver to obtain feasible solutions. However, complex path conditions, such as the nonlinear constraints that appear widely in programs, are hard for existing solvers to handle efficiently. In this paper, we extend the classical symbolic execution framework with a machine learning approach to constraint satisfaction. The approach samples candidate solutions and learns from them to identify potentially feasible areas. This sampling-and-learning style of solving can be applied easily to different classes of complex problems. By incorporating this approach, our framework, MLBSE, supports the symbolic execution of not only simple linear path conditions but also nonlinear arithmetic operations and even black-box calls to library methods. Moreover, thanks to the theoretical foundation of the machine learning based approach, when the solver fails to solve a path condition, we can provide an estimation of the confidence in the satisfiability (ECS) of the problem, which gives users insight into how the problem was analyzed and whether a solution could ultimately be found. We implement MLBSE on the basis of Symbolic Path Finder (SPF) as a fully automatic Java symbolic execution engine, so users can feed their code to MLBSE directly. To evaluate its performance, we use 22 real-world programs as benchmarks for test case generation with MLBSE; they contain a total of 1042 methods that are full of nonlinear operations, floating-point arithmetic, and native method calls. Experimental results show that MLBSE achieves much higher coverage than state-of-the-art tools.
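The sampling-and-learning idea can be illustrated with a small sketch. This is not the MLBSE solver: it is a simplified, hypothetical classification-style search written for illustration only. The path condition (x*x + y*y <= 1 and sin(x)*y >= 0.1), the penalty function `violation`, the helper `learn_box`, and all parameter values are assumptions that do not come from the paper. The sketch scores sampled points by how badly they violate the condition, treats the best few as positive examples, learns an axis-aligned box around one positive example that cuts away worse samples, and draws the next sample from that box.

```python
import math
import random


def violation(x, y):
    """Penalty that is 0.0 exactly when x*x + y*y <= 1 and sin(x)*y >= 0.1."""
    return max(0.0, x * x + y * y - 1.0) + max(0.0, 0.1 - math.sin(x) * y)


def learn_box(positive, negatives, bounds):
    """Shrink the search box around one good sample, cutting away bad ones."""
    box = [[lo, hi] for lo, hi in bounds]
    for neg in negatives:
        inside = all(lo <= v <= hi for v, (lo, hi) in zip(neg, box))
        dims = [d for d in range(len(box)) if neg[d] != positive[d]]
        if not inside or not dims:
            continue
        # Cut halfway between the bad and the good sample on the widest gap.
        d = max(dims, key=lambda i: abs(neg[i] - positive[i]))
        cut = (neg[d] + positive[d]) / 2.0
        if neg[d] < positive[d]:
            box[d][0] = max(box[d][0], cut)
        else:
            box[d][1] = min(box[d][1], cut)
    return box


def solve(bounds, budget=3000, init=50, keep=5, pool=50, explore=0.2, seed=0):
    """Return a point satisfying the condition, or None if the budget runs out."""
    rng = random.Random(seed)

    def uniform(box):
        return tuple(rng.uniform(lo, hi) for lo, hi in box)

    # Start from uniformly sampled points scored by their violation.
    samples = []
    for _ in range(init):
        point = uniform(bounds)
        samples.append((violation(*point), point))
    evaluations = init
    while True:
        samples.sort(key=lambda s: s[0])
        samples = samples[:pool]  # keep only the best samples seen so far
        if samples[0][0] == 0.0:
            return samples[0][1]
        if evaluations >= budget:
            return None
        positives = [p for _, p in samples[:keep]]
        negatives = [p for _, p in samples[keep:]]
        if rng.random() < explore:
            candidate = uniform(bounds)  # occasional global exploration
        else:
            box = learn_box(rng.choice(positives), negatives, bounds)
            candidate = uniform(box)  # sample inside the learned region
        samples.append((violation(*candidate), candidate))
        evaluations += 1
```

Calling `solve([(-2.0, 2.0), (-2.0, 2.0)])` searches the square [-2, 2] x [-2, 2]. Because the search only ever evaluates the condition at sampled points, the same loop would work if `violation` wrapped a black-box library call, which is the property the abstract highlights. A `None` result means only that the budget was exhausted; this sketch makes no attempt to compute anything like the ECS estimate.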
Acknowledgements
The authors thank the anonymous reviewers and editors for their valuable advice on improving this paper. The authors also thank Mr. Xin Li, Mr. Yuchao Duan, and Mr. Bochuan Chen for their efforts in developing MLBSE. This work is supported in part by the National Key Research and Development Program of China (2020AAA0107200), the National Natural Science Foundation of China (Nos. 61632015, 61690204, 61876077), and the Leading-edge Technology Program of Jiangsu Natural Science Foundation (No. BK20202001).
Additional information
Zhiming Liu, Xiaoping Chen, Ji Wang and Jim Woodcock
Cite this article
Bu, L., Liang, Y., Xie, Z. et al. Machine learning steered symbolic execution framework for complex software code. Form Asp Comp 33, 301–323 (2021). https://doi.org/10.1007/s00165-021-00538-3