Program Analyses Using Newton’s Method (Invited Paper)
Esparza et al. generalized Newton’s method—a numerical-analysis algorithm for finding roots of real-valued functions—to a method for finding fixed-points of systems of equations over semirings. Their method provides a new way to solve interprocedural dataflow-analysis problems. As in its real-valued counterpart, each iteration of their method solves a simpler “linearized” problem.
Because essentially all fast iterative numerical methods are forms of Newton’s method, this advance is exciting because it may provide the key to creating faster program-analysis algorithms. However, there is an important difference between the dataflow-analysis and numerical-analysis contexts: when Newton’s method is used in numerical problems, commutativity of multiplication is relied on to rearrange an expression of the form “\(a * Y * b + c * Y * d\)” into “\((a * b + c * d) * Y\).” Equations with such expressions correspond to path problems described by regular languages. In contrast, when Newton’s method is used for interprocedural dataflow analysis, the “multiplication” operation involves function composition, and hence is non-commutative: “\(a * Y * b + c * Y * d\)” cannot be rearranged into “\((a * b + c * d) * Y\).” Equations with the former expressions correspond to path problems described by linear context-free languages (LCFLs).
The invited talk that this paper accompanies presented a method that we developed in 2015 for solving the LCFL sub-problems produced during successive rounds of Newton’s method. It uses some algebraic slight-of-hand to turn a class of LCFL path problems into regular-language path problems. This result is surprising because a reasonable sanity check—formal-language theory—suggests that it should be impossible: after all, the LCFL languages are a strict superset of the regular languages.
The talk summarized several concepts and prior results on which that result is based. The method described applies to predicate abstraction, on which most of today’s software model checkers rely, as well as to other abstract domains used in program analysis.
This work was supported, in part, by a gift from Rajiv and Ritu Batra; DARPA MUSE award FA8750-14-2-0270 and DARPA STAC award FA8750-15-C-0082; and by the UW-Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation. Any opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors, and do not necessarily reflect the views of the sponsoring agencies.
- 2.Bouajjani, A., Esparza, J., Touili, T.: A generic approach to the static analysis of concurrent programs with procedures. In: POPL (2003)Google Scholar
- 3.Cousot, P., Cousot, R.: Static determination of dynamic properties of recursive procedures. In: Neuhold, E. (ed.) Formal Descriptions of Programming Concepts, IFIP WG 2.2, St. Andrews, Canada, August 1977, pp. 237–277. North-Holland (1978)Google Scholar
- 7.Hopkins, M., Kozen, D.: Parikh’s theorem in commutative Kleene algebra. In: LICS (1999)Google Scholar
- 10.Müller-Olm, M., Seidl, H.: Precise interprocedural analysis through linear algebra. In: POPL (2004)Google Scholar
- 11.Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: POPL (1995)Google Scholar
- 13.Sharir, M., Pnueli, A.: Two Approaches to Interprocedural Data Flow Analysis. In: Program Flow Analysis: Theory and Applications. Prentice-Hall (1981)Google Scholar