Abstract Parsing: Static Analysis of Dynamically Generated String Output Using LR-Parsing Technology

  • Kyung-Goo Doh
  • Hyunha Kim
  • David A. Schmidt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5673)


We combine LR(k)-parsing technology and data-flow analysis to analyze, in advance of execution, the documents generated dynamically by a program. Based on the document language’s context-free reference grammar and the program’s control structure, the analysis predicts how the documents will be generated and parses the predicted documents. Our strategy remembers context-free structure by computing abstract LR-parse stacks. The technique is implemented in Objective Caml and has statically validated a suite of PHP programs that dynamically generate HTML documents.
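The idea of carrying parse stacks through a program's control structure can be illustrated with a toy sketch. The code below is not the authors' algorithm or implementation: it swaps a simple tag-matching pushdown in for a real LR(k) table, handles only sequencing and branching (no loops, so no fixed-point computation is needed), and every name in it (`step`, `analyze`, `validates`, the tuple-based program encoding) is a hypothetical choice made for illustration. It shows only how a set of possible parse stacks is propagated through a program and joined at branch points.

```python
from typing import FrozenSet, Optional, Tuple

# A parse stack is the tuple of currently open tag names (innermost last).
# None stands for "a parse error occurred on this path".
Stack = Optional[Tuple[str, ...]]


def step(stack: Stack, token: str) -> Stack:
    """Advance one concrete parse stack by one emitted token."""
    if stack is None:
        return None                      # errors are sticky
    if token.startswith("</"):
        name = token[2:-1]               # "</b>" -> "b"
        if stack and stack[-1] == name:
            return stack[:-1]            # matching close tag: pop
        return None                      # mismatched or unexpected close tag
    if token.startswith("<"):
        return stack + (token[1:-1],)    # open tag: push
    return stack                         # plain text leaves the stack alone


def analyze(prog: tuple, stacks: FrozenSet[Stack]) -> FrozenSet[Stack]:
    """Propagate a set of possible parse stacks through a program.

    Programs are nested tuples (an assumed encoding):
      ("emit", token) | ("seq", p1, p2, ...) | ("choice", p1, p2)
    """
    kind = prog[0]
    if kind == "emit":
        return frozenset(step(s, prog[1]) for s in stacks)
    if kind == "seq":
        for sub in prog[1:]:
            stacks = analyze(sub, stacks)
        return stacks
    if kind == "choice":
        # Either branch may run, so join the results by set union.
        return analyze(prog[1], stacks) | analyze(prog[2], stacks)
    raise ValueError(f"unknown program node: {kind!r}")


def validates(prog: tuple) -> bool:
    """True if every path ends with an empty stack and no error."""
    return analyze(prog, frozenset({()})) == frozenset({()})


# Each branch opens and closes its own tag: every path is well formed.
good = ("choice",
        ("seq", ("emit", "<b>"), ("emit", "hi"), ("emit", "</b>")),
        ("seq", ("emit", "<i>"), ("emit", "hi"), ("emit", "</i>")))

# One branch forgets the closing tag: some path leaves "b" open.
bad = ("seq", ("emit", "<b>"),
       ("choice", ("emit", "</b>"), ("emit", "oops")))
```

Here `validates(good)` is `True` and `validates(bad)` is `False`. A program with loops would need a fixed-point iteration over stack sets, plus some way to keep those sets finite, which this sketch leaves out.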







Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Kyung-Goo Doh (1)
  • Hyunha Kim (1)
  • David A. Schmidt (2)
  1. Hanyang University, Ansan, South Korea
  2. Kansas State University, Manhattan, Kansas, USA
