Completeness in Approximate Transduction

  • Mila Dalla Preda
  • Roberto Giacobazzi
  • Isabella Mastroeni
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9837)

Abstract

Symbolic finite automata (SFAs) represent regular languages of strings over an infinite alphabet of symbols. These automata have recently been studied in the context of abstract interpretation, where they prove highly flexible in representing languages at different levels of abstraction. SFAs can therefore naturally approximate sets of strings by the language they recognise, providing a suitable abstract domain for the analysis of symbolic data structures. In this scenario, transducers model SFA transformations. We characterise the properties of transduction of SFAs that guarantee soundness and completeness of the abstract interpretation of operations manipulating strings. We apply our model to the derivation of sanitisers for preventing cross-site scripting attacks in web application security. In this case, we extract the code sanitiser directly from the backward (transduction) analysis of the program, given the specification of the expected attack in terms of SFAs.
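
As a rough illustration of the objects the abstract refers to, the following minimal Python sketch models an SFA whose transitions carry predicates over symbols rather than individual symbols, together with a one-state transducer that rewrites its input. The names `SFA`, `contains_lt`, and `escape_lt`, and the choice of "contains `<`" as the attack specification, are assumptions made for illustration only; they are not taken from the paper and do not reproduce its formal construction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

# A guard is a predicate over symbols, so one transition can cover
# infinitely many symbols of an unbounded alphabet.
Predicate = Callable[[str], bool]


@dataclass
class SFA:
    initial: int
    finals: Set[int]
    # state -> list of (guard predicate, target state)
    transitions: Dict[int, List[Tuple[Predicate, int]]]

    def accepts(self, word: str) -> bool:
        """Run the automaton nondeterministically over the word."""
        current = {self.initial}
        for symbol in word:
            current = {
                target
                for state in current
                for guard, target in self.transitions.get(state, [])
                if guard(symbol)
            }
            if not current:
                return False
        return bool(current & self.finals)


# Hypothetical attack specification: every string containing '<'.
contains_lt = SFA(
    initial=0,
    finals={1},
    transitions={
        0: [(lambda c: c != "<", 0), (lambda c: c == "<", 1)],
        1: [(lambda c: True, 1)],
    },
)


def escape_lt(word: str) -> str:
    """Hypothetical one-state transducer: rewrite '<' to '&lt;', copy the rest."""
    return "".join("&lt;" if c == "<" else c for c in word)
```

In this toy setting, a string accepted by `contains_lt` is no longer accepted after passing through `escape_lt`, which is the kind of relationship between an attack specification and a sanitiser that the abstract describes at the level of whole languages.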

Keywords

Abstract interpretation, Symbolic automata, Symbolic transducers

Copyright information

© Springer-Verlag GmbH Germany 2016

Authors and Affiliations

  • Mila Dalla Preda (1)
  • Roberto Giacobazzi (1, 2)
  • Isabella Mastroeni (1)

  1. University of Verona, Verona, Italy
  2. IMDEA Software Institute, Madrid, Spain