Symbolic String Verification: An Automata-Based Approach

  • Fang Yu
  • Tevfik Bultan
  • Marco Cova
  • Oscar H. Ibarra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5156)

Abstract

We present an automata-based approach for the verification of string operations in PHP programs based on symbolic string analysis. String analysis is a static analysis technique that determines the values that a string expression can take during program execution at a given program point. This information can be used to verify that string values are sanitized properly and to detect programming errors and security vulnerabilities. In our string analysis approach, we encode the set of string values that string variables can take as automata. We implement all string functions using a symbolic automata representation (the MBDD representation from the MONA automata package) and leverage efficient manipulations on MBDDs, e.g., determinization and minimization. In particular, we propose a novel algorithm for language-based replacement. Our replacement function takes three DFAs as arguments and outputs a DFA. Finally, we apply a widening operator defined on automata to approximate fixpoint computations. If this conservative approximation does not include any bad patterns (specified as regular expressions), we conclude that the program does not contain any errors or vulnerabilities. Our experimental results demonstrate that our approach is effective at checking the correctness of sanitization operations in real-world PHP applications.
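The final check described in the abstract (does the over-approximated set of string values include any bad pattern?) amounts to testing whether the intersection of two regular languages is empty. The following is a minimal sketch of that one step only, under stated assumptions: it uses explicit Python dictionaries rather than the MBDD representation from MONA, it does not implement the replacement algorithm or the widening operator, and every name in it (`DFA`, `intersection_is_empty`, `ATTACK`, `SANITIZED`, `UNSANITIZED`, the three-symbol alphabet) is hypothetical and chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass

ALPHABET = "ab<"  # toy alphabet (assumption); '<' stands in for a dangerous character


@dataclass
class DFA:
    start: object
    accepting: frozenset
    delta: dict  # (state, symbol) -> state; a missing entry means "reject"

    def accepts(self, word):
        state = self.start
        for symbol in word:
            if (state, symbol) not in self.delta:
                return False
            state = self.delta[(state, symbol)]
        return state in self.accepting


def intersection_is_empty(a, b, alphabet=ALPHABET):
    """Explore the product automaton of a and b breadth-first.

    Returns False as soon as a reachable pair of states is accepting
    in both automata, i.e. some string belongs to both languages.
    """
    seen = {(a.start, b.start)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if p in a.accepting and q in b.accepting:
            return False
        for symbol in alphabet:
            if (p, symbol) in a.delta and (q, symbol) in b.delta:
                nxt = (a.delta[(p, symbol)], b.delta[(q, symbol)])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return True


# Bad pattern: any string containing '<' (state 1 is reached after the first '<').
ATTACK = DFA(
    start=0,
    accepting=frozenset({1}),
    delta={(0, "a"): 0, (0, "b"): 0, (0, "<"): 1,
           (1, "a"): 1, (1, "b"): 1, (1, "<"): 1},
)

# Values a variable may hold after a sanitizer that deletes every '<'.
SANITIZED = DFA(start=0, accepting=frozenset({0}),
                delta={(0, "a"): 0, (0, "b"): 0})

# Values of an unsanitized input: any string over the alphabet.
UNSANITIZED = DFA(start=0, accepting=frozenset({0}),
                  delta={(0, s): 0 for s in ALPHABET})

sanitized_is_safe = intersection_is_empty(SANITIZED, ATTACK)
unsanitized_is_safe = intersection_is_empty(UNSANITIZED, ATTACK)
```

In this toy setting the sanitized language has an empty intersection with the bad pattern, so the check reports it as safe, while the unsanitized language does not. Because the analysed language is an over-approximation, an empty intersection supports a "no vulnerability" conclusion, whereas a non-empty one may be a false alarm.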

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Fang Yu (1)
  • Tevfik Bultan (1)
  • Marco Cova (1)
  • Oscar H. Ibarra (1)
  1. Department of Computer Science, University of California, Santa Barbara
