Effective Search-Space Pruning for Solvers of String Equations, Regular Expressions and Length Constraints

  • Yunhui Zheng
  • Vijay Ganesh
  • Sanu Subramanian
  • Omer Tripp
  • Julian Dolby
  • Xiangyu Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9206)

Abstract

In recent years, string solvers have become an essential component in many formal-verification, security-analysis and bug-finding tools. Such solvers typically support a theory of string equations, the length function, and the regular-expression membership predicate. Together these provide considerable expressive power, which comes at the cost of slow solving times and, in some cases, even nontermination. We present two techniques, designed for word-based SMT string solvers, to mitigate these problems: (i) sound and complete detection of overlapping variables, which is essential to avoiding common cases of nontermination; and (ii) pruning of the search space via bi-directional integration between the string and integer theories, enabling new cross-domain heuristics. We have implemented both techniques atop the Z3-str solver, resulting in a significantly more robust and efficient solver, dubbed Z3str2, for the quantifier-free theory of string equations, the regular-expression membership predicate and linear arithmetic over the length function. We report on a series of experiments over four sets of challenging real-world benchmarks, in which we compared Z3str2 with five other string solvers: S3, CVC4, Kaluza, PISA and Stranger. Each of these tools uses a different solving strategy and/or string representation (based, e.g., on words, bit vectors or automata). The results point to the efficacy of our proposed techniques, which yield dramatic performance improvements. We argue that the techniques presented here are broadly applicable and can be integrated into other SMT-backed string solvers to improve their performance.
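As a rough illustration of how length information can prune a string search (a minimal sketch, not the Z3str2 implementation), the Python snippet below abstracts a word equation to an equation over lengths and checks whether any variable lengths within given bounds can balance the two sides. The token encoding, the function names `side_length` and `length_feasible`, and the brute-force enumeration are assumptions made for this example; a real solver would delegate the arithmetic to an integer theory solver.

```python
from itertools import product

# Hypothetical encoding: each side of a word equation is a list of tokens,
# ("var", name) for a string variable or ("const", text) for a literal.

def side_length(side, lengths):
    """Length of one side: literals contribute len(text), variables their assigned length."""
    return sum(lengths[value] if kind == "var" else len(value) for kind, value in side)

def length_feasible(lhs, rhs, bounds):
    """Return True if some variable lengths within the (lo, hi) bounds make both sides equally long.

    If this returns False, no string assignment can satisfy lhs = rhs,
    so the branch can be discarded without exploring string solutions.
    """
    names = sorted(bounds)
    ranges = [range(bounds[n][0], bounds[n][1] + 1) for n in names]
    for combo in product(*ranges):
        lengths = dict(zip(names, combo))
        if side_length(lhs, lengths) == side_length(rhs, lengths):
            return True
    return False

# Equation: X . "abc" = "ab" . Y, i.e. |X| + 3 = 2 + |Y|
lhs = [("var", "X"), ("const", "abc")]
rhs = [("const", "ab"), ("var", "Y")]

print(length_feasible(lhs, rhs, {"X": (0, 5), "Y": (0, 5)}))  # True  (e.g. |X| = 0, |Y| = 1)
print(length_feasible(lhs, rhs, {"X": (0, 5), "Y": (0, 0)}))  # False (would need |X| = -1)
```

The check is only a necessary condition: a length-feasible equation may still be unsatisfiable over strings, so it can rule branches out but never confirms a solution on its own.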

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yunhui Zheng (1)
  • Vijay Ganesh (2)
  • Sanu Subramanian (2)
  • Omer Tripp (1)
  • Julian Dolby (1)
  • Xiangyu Zhang (3)

  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
  2. University of Waterloo, Waterloo, Canada
  3. Purdue University, West Lafayette, USA
