Skip to main content

StrSolve: solving string constraints lazily

Abstract

Reasoning about strings is becoming a key step at the heart of many program analysis and testing frameworks. Stand-alone string constraint solving tools, called decision procedures, have been the focus of recent research in this area. The aim of this work is to provide algorithms and implementations that can be used by a variety of program analyses through a well-defined interface. This separation enables independent improvement of string constraint solving algorithms and reduces client effort.

We present StrSolve, a decision procedure that reasons about equations over string variables. Our approach scales well with respect to the size of the input constraints, especially compared to other contemporary techniques. Our approach performs an explicit search for a satisfying assignment, but constructs the search space lazily based on an automata representation. We empirically evaluate our approach by comparing it with four existing string decision procedures on a number of tasks. We find that our prototype is, on average, several orders of magnitude faster than the fastest existing approaches, and present evidence that our lazy search space enumeration accounts for most of that benefit.

This is a preview of subscription content, access via your institution.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Notes

  1. 1.

    http://code.google.com/p/strsolve/.

  2. 2.

    Hampi has since added support for ranges of length bounds; at the time of writing, it is implemented using a very similar approach.

References

  1. Axelsson, R., Heljanko, K., Lange, M.: Analyzing context-free grammars using an incremental sat solver. In: International Colloquium on Automata, Languages and Programming, pp. 410–422 (2008). doi:10.1007/978-3-540-70583-3_34

    Chapter  Google Scholar 

  2. Balzarotti, D., Cova, M., Felmetsger, V., Jovanovic, N., Kirda, E., Kruegel, C., Vigna, G.: Saner: composing static and dynamic analysis to validate sanitization in web applications. In: IEEE Symposium on Security and Privacy, pp. 387–401 (2008)

    Google Scholar 

  3. Bjørner, N., Tillmann, N., Voronkov, A.: Path feasibility analysis for string-manipulating programs. In: Tools and Algorithms for the Construction and Analysis of Systems (2009)

    Google Scholar 

  4. Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE Trans. Comput. 35(8), 677–691 (1986)

    MATH  Article  Google Scholar 

  5. Cadar, C., Godefroid, P., Khurshid, S., Pasareanu, C.S., Sen, K., Tillmann, N., Visser, W.: Symbolic execution for software testing in practice: preliminary assessment. In: International Conference on Software Engineering, pp. 1066–1071 (2011)

    Google Scholar 

  6. Christensen, A.S., Møller, A., Schwartzbach, M.I.: Precise analysis of string expressions. In: International Symposium on Static Analysis, pp. 1–18 (2003)

    Google Scholar 

  7. de Moura, L.M., Bjørner, N.: Z3: an efficient SMT solver. In: Tools and Algorithms for the Construction and Analysis of Systems (2008)

    Google Scholar 

  8. Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. J. ACM 52(3), 365–473 (2005). doi:10.1145/1066100.1066102

    MathSciNet  Article  Google Scholar 

  9. Eén, N., Sörensson, N.: An extensible sat-solver. In: Theory and Applications of Satisfiability Testing, pp. 502–518 (2003)

    Google Scholar 

  10. Fu, X., Li, C.C.: Modeling regular replacement for string constraint solving. In: Muñoz, C. (ed.) Proceedings of the Second NASA Formal Methods Symposium (NFM 2010), NASA/CP-2010-216215, NASA, Langley Research Center, Hampton, VA 23681-2199, USA, pp. 67–76 (2010)

    Google Scholar 

  11. Fu, X., Powell, M., Bantegui, M., Li, C.C.: Simple linear string constraints. Form. Asp. Comput. 1–45 (2012). doi:10.1007/s00165-011-0214-3

  12. Fujitsu Laboratories: Fujitsu develops technology to enhance comprehensive testing of Java programs (2010). URL http://www.fujitsu.com/global/news/pr/archives/month/2010/20100112-02.html

  13. Ganesh, V., Dill, D.L.: A decision procedure for bit-vectors and arrays. In: Computer-Aided Verification, pp. 519–531 (2007)

    Chapter  Google Scholar 

  14. Godefroid, P., Klarlund, N., Sen, K.: DART: directed automated random testing. In: Programming Language Design and Implementation (2005)

    Google Scholar 

  15. Godefroid, P., Kiezun, A., Levin, M.Y.: Grammar-based whitebox fuzzing. In: Programming Language Design and Implementation (2008a)

    Google Scholar 

  16. Godefroid, P., Levin, M., Molnar, D.: Automated whitebox fuzz testing. In: Network Distributed Security Symposium (2008b)

    Google Scholar 

  17. Henriksen, J., Jensen, J., Jørgensen, M., Klarlund, N., Paige, B., Rauhe, T., Sandholm, A.: Mona: monadic second-order logic in practice. In: TACAS ’95. LNCS, vol. 1019. Springer, Berlin (1995)

    Google Scholar 

  18. Hooimeijer, P., Veanes, M.: An evaluation of automata algorithms for string analysis. In: Verification, Model Checking, and Abstract Interpretation, pp. 248–262 (2011)

    Chapter  Google Scholar 

  19. Hooimeijer, P., Weimer, W.: A decision procedure for subset constraints over regular languages. In: Programming Languages Design and Implementation, pp. 188–198 (2009)

    Google Scholar 

  20. Hooimeijer, P., Weimer, W.: Solving string constraints lazily. In: Automated Software Engineering, pp. 377–386 (2010)

    Google Scholar 

  21. Hooimeijer, P., Livshits, B., Molnar, D., Saxena, P., Veanes, M.: Fast and precise sanitizer analysis with bek. In: USENIX Security Symposium, pp. 1–15 (2011)

    Google Scholar 

  22. Ilie, L., Yu, S.: Follow automata. Inf. Comput. 186(1), 140–162 (2003). doi:10.1016/S0890-5401(03)00090-7

    MathSciNet  MATH  Article  Google Scholar 

  23. Kiezun, A., Ganesh, V., Guo, P.J., Hooimeijer, P., Ernst, M.D.: Hampi: a solver for string constraints. In: International Symposium on Software Testing and Analysis, pp. 105–116 (2009)

    Google Scholar 

  24. Lakhotia, K., McMinn, P., Harman, M.: Handling dynamic data structures in search based testing. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1759–1766 (2008)

    Google Scholar 

  25. Lakhotia, K., McMinn, P., Harman, M.: Automated test data generation for coverage: haven’t we solved this problem yet? In: Testing Academia and Industry Conference, pp. 95–104 (2009)

    Chapter  Google Scholar 

  26. Lakhotia, K., McMinn, P., Harman, M.: An empirical investigation into branch coverage for c programs using cute and Austin. J. Syst. Softw. 83(12), 2379–2391 (2010)

    Article  Google Scholar 

  27. Li, N., Xie, T., Tillmann, N., de Halleux, J., Schulte, W.: Reggae: automated test generation for programs using complex regular expressions. In: Automated Software Engineering Short Paper (2009)

    Google Scholar 

  28. Majumdar, R., Sen, K.: Hybrid concolic testing. In: International Conference on Software Engineering, pp. 416–426 (2007)

    Google Scholar 

  29. Majumdar, R., Xu, R.G.: Directed test generation using symbolic grammars. In: Automated Software Engineering, pp. 134–143 (2007)

    Google Scholar 

  30. Minamide, Y.: Static approximation of dynamically generated web pages. In: International Conference on the World Wide Web, pp. 432–441 (2005). http://doi.acm.org/10.1145/1060745.1060809

    Google Scholar 

  31. Møller, A., Schwartzbach, M.I.: The pointer assertion logic engine. In: Programming Language Design and Implementation, pp. 221–231 (2001)

    Google Scholar 

  32. Moskewicz, M.W., Madigan, C.F., Zhao, Y., Zhang, L., Malik, S.: Chaff: engineering an efficient sat solver. In: Design Automation Conference, pp. 530–535 (2001)

    Google Scholar 

  33. Necula, G.C.: Proof-carrying code. In: Principles of Programming Languages, pp. 106–119 (1997)

    Google Scholar 

  34. Pasareanu, C.S., Mehlitz, P.C., Bushnell, D.H., Gundy-Burlet, K., Lowry, M.R., Person, S., Pape, M.: Combining unit-level symbolic execution and system-level concrete execution for testing NASA software. In: International Symposium on Software Testing and Analysis, pp. 15–26 (2008)

    Google Scholar 

  35. Saxena, P., Akhawe, D., Hanna, S., Mao, F., McCamant, S., Song, D.: A symbolic execution framework for javascript. In: IEEE Symposium on Security and Privacy, pp. 513–528 (2010)

    Chapter  Google Scholar 

  36. Sipser, M.: Introduction to the Theory of Computation, 2nd edn. Course Technology, Independence (1997)

    MATH  Google Scholar 

  37. Su, Z., Wassermann, G.: The essence of command injection attacks in web applications. In: Principles of Programming Languages, pp. 372–382 (2006)

    Google Scholar 

  38. Tateishi, T., Pistoia, M., Tripp, O.: Path- and index-sensitive string analysis based on monadic second-order logic. In: ISSTA ’11, pp. 166–176. ACM, New York (2011)

    Google Scholar 

  39. Veanes, M., de Halleux, P., Tillmann, N.: Rex: symbolic regular expression explorer. In: International Conference on Software Testing, Verification and Validation, pp. 498–507 (2010)

    Chapter  Google Scholar 

  40. Veanes, M., Hooimeijer, P., Livshits, B., Molnar, D., Bjørner, N.: Symbolic finite state transducers: algorithms and applications. In: Principles of Programming Languages, pp. 137–150 (2012)

    Google Scholar 

  41. Wassermann, G., Su, Z.: Sound and precise analysis of web applications for injection vulnerabilities. In: Programming Languages Design and Implementation, pp. 32–41 (2007)

    Google Scholar 

  42. Wassermann, G., Su, Z.: Static detection of cross-site scripting vulnerabilities. In: International Conference on Software Engineering (2008)

    Google Scholar 

  43. Weimer, W., Nguyen, T., Le Goues, C., Forrest, S.: Automatically finding patches using genetic programming. In: International Conference on Software Engineering, pp. 364–374 (2009)

    Google Scholar 

  44. Xie, Y., Aiken, A.: Saturn: a SAT-based tool for bug detection. In: Computer Aided Verification, pp. 139–143 (2005)

    Chapter  Google Scholar 

  45. Xie, Y., Aiken, A.: Static detection of security vulnerabilities in scripting languages. In: USENIX Security Symposium, pp. 179–192 (2006)

    Google Scholar 

  46. Yu, F., Alkhalaf, M., Bultan, T.: Generating vulnerability signatures for string manipulating programs using automata-based forward and backward symbolic analyses. In: Automated Software Engineering, pp. 605–609 (2009a)

    Google Scholar 

  47. Yu, F., Bultan, T., Ibarra, O.H.: Symbolic string verification: combining string analysis and size analysis. In: Tools and Algorithms for the Construction and Analysis of Systems (2009b)

    Google Scholar 

  48. Yu, F., Bultan, T., Ibarra, O.H.: Relational string verification using multi-track automata. In: Conference on Implementation and Application of Automata, pp. 290–299 (2010)

    Google Scholar 

  49. Yu, F., Alkhalaf, M., Bultan, T.: Patching vulnerabilities with sanitization synthesis. In: International Conference on Software Engineering, pp. 251–260 (2011)

    Google Scholar 

Download references

Author information

Affiliations

Authors

Corresponding author

Correspondence to Pieter Hooimeijer.

Additional information

We gratefully acknowledge the support of the National Science Foundation (grants CCF-0905236, CCF-0954024 and CNS-0716478), Air Force Office of Scientific Research grant FA8750-11-2-0039, MURI grant FA9550-07-1-0532, and DARPA grant FA8650-10-C-7089.

Rights and permissions

Reprints and Permissions

About this article

Cite this article

Hooimeijer, P., Weimer, W. StrSolve: solving string constraints lazily. Autom Softw Eng 19, 531–559 (2012). https://doi.org/10.1007/s10515-012-0111-x

Download citation

Keywords

  • String
  • Regular language
  • Decision procedure
  • Scalability