Formal Aspects of Computing, Volume 25, Issue 6, pp 847–891

Simple linear string constraints

  • Xiang Fu
  • Michael C. Powell
  • Michael Bantegui
  • Chung-Chih Li
Original Article

Abstract

Modern web applications often suffer from command injection attacks. Even when equipped with sanitization code, many systems can be penetrated because of software bugs. It is desirable to discover such vulnerabilities automatically, given the bytecode of a web application. One approach is to symbolically execute the target system and construct constraints that match path conditions against attack patterns. Solving these constraints yields an attack signature, from which the attack can be replayed. Constraint solving is the key to symbolic execution. For web applications, string constraints receive most of the attention because web applications are essentially text processing programs. We present the simple linear string equation (SISE), a decidable fragment of the general string constraint system. SISE models a collection of regular replacement operations (such as greedy, reluctant, declarative, and finite replacement), which are frequently used by text processing programs. Various automata techniques are proposed for simulating procedural semantics such as left-most matching. By composing the atomic transducers of a SISE, we show that a recursive algorithm can compute the solution pool, which contains the value range of each variable in concrete solutions. A concrete variable solution can then be synthesized from a solution pool. To accelerate solver performance, a symbolic representation of finite state transducers is developed. This allows the constraint solver to support a 16-bit Unicode alphabet in practice. The algorithm is implemented in a Java constraint solver called SUSHI. We compare the applicability and performance of SUSHI with those of Kaluza, a bounded string solver.
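
As a rough illustration of two of the replacement semantics named in the abstract, the sketch below contrasts greedy and reluctant replacement. It uses Python's standard `re` module as a stand-in; it is not SUSHI's transducer-based encoding, and the pattern, the input string, and the helper name `replace_all` are assumptions chosen only for the example.

```python
import re

# Assumed example patterns: the same regular expression with a greedy
# quantifier and with a reluctant (non-greedy) quantifier.
GREEDY = re.compile(r"a+")      # each match takes as many 'a's as possible
RELUCTANT = re.compile(r"a+?")  # each match takes as few 'a's as possible


def replace_all(pattern: re.Pattern, text: str, replacement: str) -> str:
    """Replace every left-most, non-overlapping match of pattern in text."""
    return pattern.sub(replacement, text)


# Hypothetical input: one run of three 'a's between two other characters.
greedy_result = replace_all(GREEDY, "baaac", "x")
reluctant_result = replace_all(RELUCTANT, "baaac", "x")

print(greedy_result)     # the run is consumed by a single match
print(reluctant_result)  # each 'a' is matched and replaced separately
```

The two calls differ only in the quantifier, yet they produce different output strings. This is why a constraint solver has to model each replacement semantics precisely rather than treat replacement as a single operation.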

Keywords

String analysis · Symbolic execution · Constraint solving · Vulnerability detection

Copyright information

© British Computer Society 2012

Authors and Affiliations

  • Xiang Fu (1)
  • Michael C. Powell (1)
  • Michael Bantegui (1)
  • Chung-Chih Li (2)
  1. Department of Computer Science, Hofstra University, Hempstead, USA
  2. School of Information and Technology, Illinois State University, Normal, USA
