Simple linear string constraints

  • Original Article
Formal Aspects of Computing

Abstract

Modern web applications often suffer from command injection attacks. Even when equipped with sanitization code, many systems can be penetrated because of software bugs. It is desirable to discover such vulnerabilities automatically, given the bytecode of a web application. One approach is to symbolically execute the target system and construct constraints that match path conditions against attack patterns. Solving these constraints yields an attack signature, from which the attack can be replayed. Constraint solving is the key to symbolic execution. For web applications, string constraints receive most of the attention because web applications are essentially text-processing programs. We present the simple linear string equation (SISE), a decidable fragment of the general string constraint system. SISE models a collection of regular replacement operations (such as greedy, reluctant, declarative, and finite replacement) that are frequently used by text-processing programs. Various automata techniques are proposed for simulating procedural semantics such as left-most matching. By composing the atomic transducers of a SISE, we show that a recursive algorithm can be used to compute the solution pool, which contains the value range of each variable in concrete solutions. A concrete variable solution can then be synthesized from a solution pool. To accelerate solver performance, a symbolic representation of finite state transducers is developed, which allows the constraint solver to support a 16-bit Unicode alphabet in practice. The algorithm is implemented in a Java constraint solver called SUSHI. We compare the applicability and performance of SUSHI with those of Kaluza, a bounded string solver.
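The difference between greedy and reluctant replacement, and the kind of string constraint the abstract describes, can be illustrated with a minimal sketch. It uses Python's `re` module as a stand-in for the replacement semantics; this is an assumption made for illustration only, since SUSHI models these operations with finite state transducers. The `sanitize` function and the `candidate` input are hypothetical and do not come from the article.

```python
import re

def greedy(text: str) -> str:
    # Greedy: each left-most match of a+ is extended as far as possible.
    return re.sub(r"a+", "b", text)

def reluctant(text: str) -> str:
    # Reluctant: each left-most match of a+? stops as early as possible.
    return re.sub(r"a+?", "b", text)

def sanitize(text: str) -> str:
    # Hypothetical buggy sanitizer: a single pass that deletes "<script".
    return re.sub(r"<script", "", text)

# A candidate solution x to the constraint: sanitize(x) contains "<script".
candidate = "<scr<scriptipt"

print(greedy("caaat"))      # cbt
print(reluctant("caaat"))   # cbbbt
print(sanitize(candidate))  # <script
```

The last line shows why a single replacement pass is not sufficient as a sanitizer: deleting the inner match splices the surrounding fragments back into the attack pattern. A string solver searches for such inputs automatically instead of relying on a hand-picked candidate.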

References

  1. Anley C (2002) Advanced SQL injection in SQL server applications. Next generation security software

  2. Anand S, Pasareanu CS, Visser W (2007) JPF-SE: a symbolic execution extension to Java pathfinder. In: Proceedings of the 13th international conference on tools and algorithms for construction and analysis of systems (TACAS), pp 134–138

  3. Alur R, Černý P (2011) Streaming transducers for algorithmic verification of single-pass list-processing programs. In: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on principles of programming languages (POPL), pp 599–610

  4. Brat G, Havelund K, Park S, Visser W (2000) Java path finder: second generation of a Java model checker. In: Workshop on advances in verification

  5. Boyd SW, Keromytis AD (2004) SQLrand: preventing SQL injection attacks. In: Proceedings of the 2nd applied cryptography and network security conference (ACNS). Lecture notes in computer science, vol 3089. Springer, pp 292–302

  6. Büchi JR, Senger S (1988) Definability in the existential theory of concatenation and undecidable extensions of this theory. Zeitschr f math Logik und Grundlagen d Math 34: 337–342

  7. Bjørner N, Tillmann N, Voronkov A (2009) Path feasibility analysis for string-manipulating programs. In: Proceedings of the 15th international conference on tools and algorithms for the construction and analysis of systems (TACAS). Springer, pp 307–321

  8. Chaudhuri A, Foster JS (2010) Symbolic security analysis of ruby-on-rails web applications. In: Proceedings of the 17th ACM conference on computer and communications security (CCS), pp 585–594

  9. Cadar C, Ganesh V, Pawlowski PM, Dill DL, Engler DR (2006) EXE: automatically generating inputs of death. In: Proceedings of the 13th ACM conference on computer and communications security (CCS), pp 322–335

  10. Christey SM (2006) Dynamic evaluation vulnerabilities in PHP applications. http://seclists.org/fulldisclosure/2006/May/35

  11. Christensen AS, Møller A, Schwartzbach MI (2003) Extending Java for high-level web service construction. ACM Trans Program Lang Syst 25(6): 814–875

  12. Christensen AS, Møller A, Schwartzbach MI (2003) Precise analysis of string expressions. In: Proceedings of the 10th international static analysis symposium (SAS), pp 1–18

  13. Caballero J, Poosankam P, McCamant S, Babic D, Song D (2010) Input generation via decomposition and re-stitching: finding bugs in Malware. In: Proceedings of the 17th ACM conference on computer and communications security (CCS), pp 413–425

  14. Fu X, Li CC (2010) A string constraint solver for detecting web application vulnerability. In: Proceedings of the 22nd international conference on software engineering and knowledge engineering (SEKE), pp 535–542

  15. Fu X, Li CC (2010) Modeling regular replacement for string constraint solving. In: Proceedings of the 2nd NASA formal methods symposium (NFM), pp 67–76

  16. Fu X, Lu X, Peltsverger B, Chen S, Qian K, Tao L (2007) A static analysis framework for detecting SQL injection vulnerabilities. In: Proceedings of 31st annual international computer software and applications conference (COMPSAC), pp 87–96

  17. Fu X, Qian K (2008) SAFELI: SQL injection scanner using symbolic execution. In: Proceedings of the 2008 workshop on testing, analysis, and verification of web services and applications, pp 34–39

  18. Fu X, Qian K, Peltsverger B, Tao L, Liu J (2008) APOGEE: automated project grading and instant feedback system for web based computing. In: Proceedings of the 39th SIGCSE technical symposium on computer science education (SIGCSE), pp 77–81

  19. Fu X (2009) SUSHI: a solver for single linear string equations. http://people.hofstra.edu/Xiang_Fu/XiangFu/projects.php

  20. Gould C, Su Z, Devanbu PT (2004) JDBC checker: a static analysis tool for SQL/JDBC applications. In: Proceedings of the 26th international conference on software engineering (ICSE), pp 697–698

  21. Huang YW, Huang SK, Lin TP, Tsai CH (2003) Web application security assessment by fault injection and behavior monitoring. In: Proceedings of the 12th international world wide web conference (WWW), pp 148–159

  22. Hooimeijer P, Livshits B, Molnar D, Saxena P, Veanes M (2011) Fast and precise sanitizer analysis with BEK. In: Proceedings of the 20th USENIX security symposium (to appear)

  23. Henglein F, Nielsen L (2011) Regular expression containment: coinductive axiomatization and computational interpretation. In: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on principles of programming languages (POPL), pp 385–398

  24. Halfond W, Orso A (2005) AMNESIA: analysis and monitoring for NEutralizing SQL-injection attacks. In: Proceedings of the 20th IEEE/ACM international conference on automated software engineering (ASE), pp 174–183

  25. HP WebInspect (2011) https://download.hpsmartupdate.com/webinspect/. Accessed July 2011

  26. Hopcroft JE, Ullman JD (1979) Introduction to automata theory, languages, and computation. Addison-Wesley

  27. Hooimeijer P, Veanes M (2011) An evaluation of automata algorithms for string analysis. In: Proceedings of the 12th international conference on verification, model checking, and abstract interpretation (VMCAI), pp 248–262

  28. Hooimeijer P, Weimer W (2009) A decision procedure for subset constraints over regular languages. In: Proceedings of the 2009 ACM SIGPLAN conference on programming language design and implementation (PLDI), pp 188–198

  29. Hooimeijer P, Weimer W (2010) Solving string constraints lazily. In: Proceedings of the 25th IEEE/ACM international conference on automated software engineering (ASE), pp 377–386

  30. Jurafsky D, Martin JH (2008) Speech and language processing (2e). Prentice Hall

  31. Karttunen L, Chanod J-P, Grefenstette G, Schiller A (1996) Regular expressions for language engineering. Nat Lang Eng 2: 305–328

  32. Kiezun A, Ganesh V, Guo PJ, Hooimeijer P, Ernst MD (2009) HAMPI: a solver for string constraints. In: Proceedings of the 18th international symposium on testing and analysis (ISSTA), pp 105–116

  33. Kiezun A, Guo PJ, Jayaraman K, Ernst MD (2009) Automatic creation of SQL injection and cross-site scripting attacks. In: Proceedings of the 31st international conference on software engineering (ICSE), pp 199–209

  34. King JC (1976) Symbolic execution and program testing. Commun ACM 19(7): 385–394

  35. Kaplan RM, Kay M (1994) Regular models of phonological rule systems. Comput Linguist 20(3): 331–378

  36. Kirkegaard C, Møller A (2006) Static analysis for Java servlets and JSP. In: Proceedings of the 13th international static analysis symposium (SAS), pp 336–352

  37. Labs@gdssecurity.com. (2009) Adobe Flex SDK Input Validation Bug in ‘index.template.html’ Permits Cross-Site Scripting Attacks. http://www.securitytracker.com/alerts/2009/Aug/1022748.html

  38. Lothaire M (2002) Algebraic combinatorics on words. Cambridge University Press

  39. Makanin GS (1977) The problem of solvability of equations in a free semigroup. Math USSR-Sbornik 32(2): 129–198

  40. Minamide Y (2005) Static approximation of dynamically generated Web pages. In: Proceedings of the 14th international conference on World Wide Web (WWW), pp 432–441

  41. Moser A, Kruegel C, Kirda E (2007) Exploring multiple execution paths for Malware analysis. In: Proceedings of the 2007 IEEE symposium on security and privacy (S&P), pp 231–245

  42. Mohri M, Nederhof MJ (2001) Regular approximation of context-free grammars through transformation. Robustness Lang Speech Technol 153–163

  43. Møller A (2009) The dk.brics.automaton package. http://www.brics.dk/automaton/. Accessed July 2009

  44. Mohri M (1997) Finite-state transducers in language and speech processing. Comput Linguist 23(2): 269–311

  45. Newsham T (2000) Format string attacks. Bugtraq mailing list. http://seclists.org/bugtraq/2000/Sep/0214.html

  46. Nguyen-Tuong A, Guarnieri S, Greene D, Shirley J, Evans D (2005) Automatically hardening Web applications using precise tainting. In: Proceedings of the 20th IFIP international information security conference (SEC), pp 295–308

  47. Pugh W (1994) The Omega project. http://www.cs.umd.edu/projects/omega/

  48. Rafail J (2001) Cross-site scripting vulnerabilities. CERT Coordination Center, Carnegie Mellon University. http://www.cert.org/archive/pdf/cross_site_scripting.pdf

  49. Rozenberg G, Salomaa A (eds) (1997) Handbook of formal languages. Word, language, grammar, vol 1. Springer

  50. Saxena P, Akhawe D, Hanna S, Mao F, McCamant S, Song D (2010) A symbolic execution framework for JavaScript. In: Proceedings of the 31st IEEE symposium on security and privacy (S&P), pp 513–528

  51. Saxena P, Akhawe D, McCamant S, Song D (2010) Kaluza constraint solver. http://webblaze.cs.berkeley.edu/2010/kaluza/

  52. Shiflett C (2004) Security corner: cross-site request forgeries. http://shiflett.org/articles/cross-site-request-forgeries

  53. Sullo C, Lodge D (2010) Nikto. http://www.cirt.net/nikto2. Accessed July 2010

  54. Veanes M, Bjørner N, de Moura L (2010) Symbolic automata constraint solving. In: Proceedings of the 17th international conference of logic for programming, artificial intelligence, and reasoning (LPAR), pp 640–654

  55. Xie T, Marinov D, Schulte W, Notkin D (2005) Symstra: a framework for generating object-oriented unit tests using symbolic execution. In: Proceedings of the 11th international conference on tools and algorithms for the construction and analysis of systems (TACAS), pp 365–381

  56. Yu F, Alkhalaf M, Bultan T (2009) Generating vulnerability signatures for string manipulating programs using automata-based forward and backward symbolic analyses. In: Proceedings of the 24th IEEE/ACM international conference on automated software engineering (ASE), pp 605–609

  57. Yu F, Alkhalaf M, Bultan T (2010) Stranger: an automata-based string analysis tool for PHP. In: Proceedings of the 16th international conference on tools and algorithms for the construction and analysis of systems (TACAS), pp 154–157

  58. Yu F, Bultan T, Cova M, Ibarra OH (2008) Symbolic string verification: an automata-based approach. In: Proceedings of the 15th SPIN workshop on model checking software (SPIN), pp 306–324

  59. Yu F, Bultan T, Ibarra OH (2009) Symbolic string verification: combining string analysis and size analysis. In: Proceedings of the 15th international conference on tools and algorithms for the construction and analysis of systems (TACAS), pp 322–336. Springer

Author information

Correspondence to Xiang Fu.

Additional information

Jim Woodcock

This paper is based on the preliminary findings reported in [FLP+07, FQ08, FL10a, FL10b].

About this article

Cite this article

Fu, X., Powell, M.C., Bantegui, M. et al. Simple linear string constraints. Form Asp Comp 25, 847–891 (2013). https://doi.org/10.1007/s00165-011-0214-3

  • DOI: https://doi.org/10.1007/s00165-011-0214-3
