Abstract
Reasoning about strings is becoming a key step at the heart of many program analysis and testing frameworks. Stand-alone string constraint solving tools, called decision procedures, have been the focus of recent research in this area. The aim of this work is to provide algorithms and implementations that can be used by a variety of program analyses through a well-defined interface. This separation enables independent improvement of string constraint solving algorithms and reduces client effort.
We present StrSolve, a decision procedure that reasons about equations over string variables. Our approach scales well with the size of the input constraints, especially compared to other contemporary techniques. It performs an explicit search for a satisfying assignment, but constructs the search space lazily, based on an automata representation. We empirically evaluate StrSolve by comparing it with four existing string decision procedures on a number of tasks. We find that our prototype is, on average, several orders of magnitude faster than the fastest existing approaches, and we present evidence that lazy search space enumeration accounts for most of that benefit.
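To make the idea of lazy search space construction concrete, the following is a minimal sketch and not StrSolve's actual algorithm. It assumes two constraints on a single string variable, each given as a deterministic finite automaton, and searches for a string that satisfies both. Instead of building the full product automaton up front, it generates pairs of states only as the search reaches them. The `DFA` class, the `lazy_witness` function, and the example automata are hypothetical names introduced for illustration.

```python
from collections import deque


class DFA:
    """Hypothetical minimal DFA: transitions[state][char] -> next state."""

    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = transitions

    def step(self, state, char):
        # Returns None when no transition is defined for (state, char).
        return self.transitions.get(state, {}).get(char)


def lazy_witness(a, b, alphabet):
    """Breadth-first search over state pairs that are created on demand.

    Returns a shortest string accepted by both automata, or None if the
    reachable part of the product contains no jointly accepting pair.
    """
    start = (a.start, b.start)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        (p, q), word = queue.popleft()
        if p in a.accepting and q in b.accepting:
            return word
        for char in alphabet:
            p2, q2 = a.step(p, char), b.step(q, char)
            if p2 is None or q2 is None:
                continue  # one automaton rejects this extension
            nxt = (p2, q2)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + char))
    return None


# Assumed example constraints over the alphabet {a, b}.
ends_in_b = DFA(0, {1}, {0: {"a": 0, "b": 1}, 1: {"a": 0, "b": 1}})
ends_in_a = DFA(0, {1}, {0: {"a": 1, "b": 0}, 1: {"a": 1, "b": 0}})
even_length = DFA(0, {0}, {0: {"a": 1, "b": 1}, 1: {"a": 0, "b": 0}})

witness = lazy_witness(ends_in_b, even_length, "ab")
no_witness = lazy_witness(ends_in_b, ends_in_a, "ab")
```

Here `witness` is a satisfying assignment for the conjunction of the two constraints, and `no_witness` is `None` because no string can end in both `a` and `b`. The search stops as soon as it finds a jointly accepting pair, so on satisfiable inputs it may visit only a fraction of the product states.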
Notes
Hampi has since added support for ranges of length bounds; at the time of writing, it is implemented using a very similar approach.
Additional information
We gratefully acknowledge the support of the National Science Foundation (grants CCF-0905236, CCF-0954024 and CNS-0716478), Air Force Office of Scientific Research grant FA8750-11-2-0039, MURI grant FA9550-07-1-0532, and DARPA grant FA8650-10-C-7089.
About this article
Cite this article
Hooimeijer, P., Weimer, W. StrSolve: solving string constraints lazily. Autom Softw Eng 19, 531–559 (2012). https://doi.org/10.1007/s10515-012-0111-x
DOI: https://doi.org/10.1007/s10515-012-0111-x