Abstract
Modern web applications often suffer from command injection attacks. Even when equipped with sanitization code, many systems can be penetrated because of software bugs. It is desirable to discover such vulnerabilities automatically, given the bytecode of a web application. One approach is to execute the target system symbolically and construct constraints that match path conditions against attack patterns. Solving these constraints yields an attack signature, from which the attack can be replayed. Constraint solving is the key to symbolic execution. For web applications, string constraints receive most of the attention because web applications are essentially text processing programs. We present the simple linear string equation (SISE), a decidable fragment of the general string constraint system. SISE models a collection of regular replacement operations (such as greedy, reluctant, declarative, and finite replacement), which are frequently used by text processing programs. Various automata techniques are proposed for simulating procedural semantics such as left-most matching. By composing the atomic transducers of a SISE, we show that a recursive algorithm can be used to compute the solution pool, which contains the value range of each variable in concrete solutions. A concrete variable solution can then be synthesized from the solution pool. To accelerate the solver, a symbolic representation of finite state transducers is developed. This allows the constraint solver to support a 16-bit Unicode alphabet in practice. The algorithm is implemented in a Java constraint solver called SUSHI. We compare the applicability and performance of SUSHI with Kaluza, a bounded string solver.
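The difference between greedy and reluctant replacement, and the way a buggy sanitizer can be bypassed, can be illustrated with a minimal Java sketch. It uses the standard `java.util.regex` semantics behind `String.replaceAll` as a stand-in for the replacement operations named in the abstract; the class name, method names, patterns, and input strings are all illustrative assumptions and are not taken from SUSHI or from the article.

```java
public class ReplacementSemantics {
    // Greedy: a+ consumes the longest run of 'a' at each left-most match.
    static String greedy(String input) {
        return input.replaceAll("a+", "c");
    }

    // Reluctant: a+? stops at the shortest non-empty match (a single 'a').
    static String reluctant(String input) {
        return input.replaceAll("a+?", "c");
    }

    // Hypothetical sanitizer (an assumption for illustration): one pass that
    // deletes every literal occurrence of "<script>" and does not re-scan.
    static String naiveSanitize(String input) {
        return input.replace("<script>", "");
    }

    public static void main(String[] args) {
        System.out.println(greedy("baab"));                     // bcb
        System.out.println(reluctant("baab"));                  // bccb
        // Removing the inner tag splices the outer fragments back together.
        System.out.println(naiveSanitize("<scr<script>ipt>"));  // <script>
    }
}
```

The last call shows the kind of input a string constraint solver is meant to find: a value that passes through the replacement and still matches the attack pattern afterwards.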
Cite this article
Fu, X., Powell, M.C., Bantegui, M. et al. Simple linear string constraints. Form Asp Comp 25, 847–891 (2013). https://doi.org/10.1007/s00165-011-0214-3
DOI: https://doi.org/10.1007/s00165-011-0214-3