Abstract
Interprocedural data-flow analyses form an expressive and useful paradigm for numerous static analysis applications, such as live-variable analysis, alias analysis, and null-pointer analysis. The most widely used framework for interprocedural data-flow analysis is IFDS, which encompasses distributive data-flow functions over a finite domain. On-demand data-flow analyses restrict the focus of the analysis to specific program locations and data facts. This setting provides a natural split between (i) an offline (or preprocessing) phase, where the program is partially analyzed and analysis summaries are created, and (ii) an online (or query) phase, where analysis queries arrive on demand and the summaries are used to speed up answering queries.
In this work, we consider on-demand IFDS analyses where the queries concern program locations of the same procedure (aka same-context queries). We exploit the fact that flow graphs of programs have low treewidth to develop faster algorithms that are space and time optimal for many common data-flow analyses, in both the preprocessing and the query phase. We also use treewidth to develop query solutions that are embarrassingly parallelizable, i.e., the total work for answering each query is split among a number of threads such that each thread performs only a constant amount of work. Finally, we implement a static analyzer based on our algorithms, and perform a series of on-demand analysis experiments on standard benchmarks. Our experimental results show a drastic speed-up of the queries after only a lightweight preprocessing phase, significantly outperforming existing techniques.
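As an illustration of the offline/online split described above, the following minimal sketch treats a same-context query as plain graph reachability between nodes that stand for (program location, data fact) pairs. It is not the treewidth-based algorithm of this work: the preprocessing here is a naive per-node search that ignores procedure calls, and the names `preprocess` and `query` as well as the example graph are hypothetical, chosen only for illustration.

```python
from collections import deque


def preprocess(num_nodes, edges):
    """Offline phase: for every node, store a bitset of the nodes reachable from it."""
    succ = [[] for _ in range(num_nodes)]
    for u, v in edges:
        succ[u].append(v)

    summaries = []
    for source in range(num_nodes):
        seen = 1 << source  # every node reaches itself
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in succ[u]:
                if not (seen >> v) & 1:
                    seen |= 1 << v
                    queue.append(v)
        summaries.append(seen)
    return summaries


def query(summaries, u, v):
    """Online phase: answer 'is v reachable from u?' with a single bit test."""
    return bool((summaries[u] >> v) & 1)


# Hypothetical 4-node graph; each node stands for a (location, fact) pair.
edges = [(0, 1), (1, 2), (3, 2)]
summaries = preprocess(4, edges)
print(query(summaries, 0, 2))  # True
print(query(summaries, 2, 0))  # False
```

The sketch spends all of its effort in the offline phase so that each query is a constant-time lookup. This naive scheme takes quadratic space in the worst case, which is the kind of cost that a lightweight preprocessing phase is meant to avoid.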
The research was partly supported by the Austrian Science Fund (FWF) Grant No. NFN S11407-N23 (RiSE/SHiNE), FWF Schrödinger Grant No. J-4220, Vienna Science and Technology Fund (WWTF) Project ICT15-003, the Facebook PhD Fellowship Program, the IBM PhD Fellowship Program, and DOC Fellowship No. 24956 of the Austrian Academy of Sciences (ÖAW). A longer version of this work is available at [17].
References
T. J. Watson libraries for analysis (WALA). https://github.com/wala/WALA (2003)
Appel, A.W., Palsberg, J.: Modern Compiler Implementation in Java. Cambridge University Press, 2nd edn. (2003)
Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for android apps. In: PLDI. pp. 259–269 (2014)
Babich, W.A., Jazayeri, M.: The method of attributes for data flow analysis. Acta Informatica 10(3) (1978)
Bebenita, M., Brandner, F., Fahndrich, M., Logozzo, F., Schulte, W., Tillmann, N., Venter, H.: Spur: A trace-based JIT compiler for CIL. In: OOPSLA. pp. 708–725 (2010)
Blackburn, S.M., Garner, R., Hoffman, C., Khan, A.M., McKinley, K.S., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S.Z., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J.E.B., Phansalkar, A., Stefanović, D., VanDrunen, T., von Dincklage, D., Wiedermann, B.: The DaCapo benchmarks: Java benchmarking development and analysis. In: OOPSLA. pp. 169–190 (2006)
Bodden, E.: Inter-procedural data-flow analysis with IFDS/IDE and soot. In: SOAP. pp. 3–8 (2012)
Bodden, E., Tolêdo, T., Ribeiro, M., Brabrand, C., Borba, P., Mezini, M.: Spllift: Statically analyzing software product lines in minutes instead of years. In: PLDI. pp. 355–364 (2013)
Bodlaender, H., Gustedt, J., Telle, J.A.: Linear-time register allocation for a fixed number of registers. In: SODA (1998)
Bodlaender, H.L.: A linear-time algorithm for finding tree-decompositions of small treewidth. SIAM Journal on computing 25(6), 1305–1317 (1996)
Bodlaender, H.L., Hagerup, T.: Parallel algorithms with optimal speedup for bounded treewidth. SIAM Journal on Computing 27(6), 1725–1746 (1998)
Burgstaller, B., Blieberger, J., Scholz, B.: On the tree width of ada programs. In: Ada-Europe. pp. 78–90 (2004)
Callahan, D., Cooper, K.D., Kennedy, K., Torczon, L.: Interprocedural constant propagation. In: CC (1986)
Chatterjee, K., Choudhary, B., Pavlogiannis, A.: Optimal dyck reachability for data-dependence and alias analysis. In: POPL. pp. 30:1–30:30 (2017)
Chatterjee, K., Goharshady, A., Goharshady, E.: The treewidth of smart contracts. In: SAC (2019)
Chatterjee, K., Goharshady, A.K., Goyal, P., Ibsen-Jensen, R., Pavlogiannis, A.: Faster algorithms for dynamic algebraic queries in basic RSMs with constant treewidth. ACM Transactions on Programming Languages and Systems 41(4), 1–46 (2019)
Chatterjee, K., Goharshady, A.K., Ibsen-Jensen, R., Pavlogiannis, A.: Optimal and perfectly parallel algorithms for on-demand data-flow analysis. arXiv preprint 2001.11070 (2020)
Chatterjee, K., Goharshady, A.K., Okati, N., Pavlogiannis, A.: Efficient parameterized algorithms for data packing. In: POPL. pp. 1–28 (2019)
Chatterjee, K., Goharshady, A.K., Pavlogiannis, A.: JTDec: A tool for tree decompositions in soot. In: ATVA. pp. 59–66 (2017)
Chatterjee, K., Ibsen-Jensen, R., Goharshady, A.K., Pavlogiannis, A.: Algorithms for algebraic path properties in concurrent systems of constant treewidth components. ACM Transactions on Programming Languages and Systems 40(3), 9 (2018)
Chatterjee, K., Ibsen-Jensen, R., Pavlogiannis, A.: Optimal reachability and a space-time tradeoff for distance queries in constant-treewidth graphs. In: ESA (2016)
Chaudhuri, S., Zaroliagis, C.D.: Shortest paths in digraphs of small treewidth. Part i: Sequential algorithms. Algorithmica 27(3-4), 212–226 (2000)
Chaudhuri, S.: Subcubic algorithms for recursive state machines. In: POPL (2008)
Chen, T., Lin, J., Dai, X., Hsu, W.C., Yew, P.C.: Data dependence profiling for speculative optimizations. In: CC. pp. 57–72 (2004)
Cousot, P., Cousot, R.: Static determination of dynamic properties of recursive procedures. In: IFIP Conference on Formal Description of Programming Concepts (1977)
Cygan, M., Fomin, F.V., Kowalik, Ł., Lokshtanov, D., Marx, D., Pilipczuk, M., Pilipczuk, M., Saurabh, S.: Parameterized algorithms, vol. 4 (2015)
Duesterwald, E., Gupta, R., Soffa, M.L.: Demand-driven computation of interprocedural data flow. In: POPL (1995)
Dutta, S.: Anatomy of a compiler. Circuit Cellar 121, 30–35 (2000)
Flückiger, O., Scherer, G., Yee, M.H., Goel, A., Ahmed, A., Vitek, J.: Correctness of speculative optimizations with dynamic deoptimization. In: POPL. pp. 49:1–49:28 (2017)
Giegerich, R., Möncke, U., Wilhelm, R.: Invariance of approximate semantics with respect to program transformations. In: ECI (1981)
Gould, C., Su, Z., Devanbu, P.: Jdbc checker: A static analysis tool for SQL/JDBC applications. In: ICSE. pp. 697–698 (2004)
Grove, D., Torczon, L.: Interprocedural constant propagation: A study of jump function implementation. In: PLDI (1993)
Guarnieri, S., Pistoia, M., Tripp, O., Dolby, J., Teilhet, S., Berg, R.: Saving the world wide web from vulnerable javascript. In: ISSTA. pp. 177–187 (2011)
Gustedt, J., Mæhle, O.A., Telle, J.A.: The treewidth of java programs. In: ALENEX. pp. 86–97 (2002)
Harel, D., Tarjan, R.E.: Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing 13(2), 338–355 (1984)
Horwitz, S., Reps, T., Sagiv, M.: Demand interprocedural dataflow analysis. ACM SIGSOFT Software Engineering Notes (1995)
Hovemeyer, D., Pugh, W.: Finding bugs is easy. ACM SIGPLAN Notices 39(12), 92–106 (Dec 2004)
Klaus Krause, P., Larisch, L., Salfelder, F.: The tree-width of C. Discrete Applied Mathematics (03 2019)
Knoop, J., Steffen, B.: The interprocedural coincidence theorem. In: CC (1992)
Krüger, S., Späth, J., Ali, K., Bodden, E., Mezini, M.: CrySL: An Extensible Approach to Validating the Correct Usage of Cryptographic APIs. In: ECOOP. pp. 10:1–10:27 (2018)
Lee, Y.f., Marlowe, T.J., Ryder, B.G.: Performing data flow analysis in parallel. In: ACM/IEEE Supercomputing. pp. 942–951 (1990)
Lee, Y.F., Ryder, B.G.: A comprehensive approach to parallel data flow analysis. In: ICS. pp. 236–247 (1992)
Lin, J., Chen, T., Hsu, W.C., Yew, P.C., Ju, R.D.C., Ngai, T.F., Chan, S.: A compiler framework for speculative optimizations. ACM Transactions on Architecture and Code Optimization 1(3), 247–271 (2004)
Muchnick, S.S.: Advanced Compiler Design and Implementation. Morgan Kaufmann (1997)
Naeem, N.A., Lhoták, O., Rodriguez, J.: Practical extensions to the IFDS algorithm. In: CC (2010)
Nanda, M.G., Sinha, S.: Accurate interprocedural null-dereference analysis for java. In: ICSE. pp. 133–143 (2009)
Rapoport, M., Lhoták, O., Tip, F.: Precise data flow analysis in the presence of correlated method calls. In: SAS. pp. 54–71 (2015)
Reps, T.: Program analysis via graph reachability. In: ILPS (1997)
Reps, T.: Undecidability of context-sensitive data-dependence analysis. ACM Transactions on Programming Languages and Systems 22(1), 162–186 (2000)
Reps, T., Horwitz, S., Sagiv, M.: Precise interprocedural dataflow analysis via graph reachability. In: POPL. pp. 49–61 (1995)
Reps, T.: Demand interprocedural program analysis using logic databases. In: Applications of Logic Databases, vol. 296 (1995)
Robertson, N., Seymour, P.D.: Graph minors. iii. planar tree-width. Journal of Combinatorial Theory, Series B 36(1), 49–64 (1984)
Rodriguez, J., Lhoták, O.: Actor-based parallel dataflow analysis. In: CC. pp. 179–197 (2011)
Rountev, A., Kagan, S., Marlowe, T.: Interprocedural dataflow analysis in the presence of large libraries. In: CC. pp. 2–16 (2006)
Sagiv, M., Reps, T., Horwitz, S.: Precise interprocedural dataflow analysis with applications to constant propagation. Theoretical Computer Science (1996)
Schubert, P.D., Hermann, B., Bodden, E.: PhASAR: An inter-procedural static analysis framework for C/C++. In: TACAS. pp. 393–410 (2019)
Shang, L., Xie, X., Xue, J.: On-demand dynamic summary-based points-to analysis. In: CGO. pp. 264–274 (2012)
Sharir, M., Pnueli, A.: Two approaches to interprocedural data flow analysis. In: Program flow analysis: Theory and applications. Prentice-Hall (1981)
Smaragdakis, Y., Bravenboer, M., Lhoták, O.: Pick your contexts well: Understanding object-sensitivity. In: POPL. pp. 17–30 (2011)
Späth, J., Ali, K., Bodden, E.: Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems. In: POPL. pp. 48:1–48:29 (2019)
Sridharan, M., Bodík, R.: Refinement-based context-sensitive points-to analysis for java. ACM SIGPLAN Notices 41(6), 387–400 (2006)
Sridharan, M., Gopan, D., Shan, L., Bodík, R.: Demand-driven points-to analysis for java. In: OOPSLA. pp. 59–76 (2005)
Thorup, M.: All structured programs have small tree width and good register allocation. Information and Computation 142(2), 159–181 (1998)
Torczon, L., Cooper, K.: Engineering a Compiler. Morgan Kaufmann, 2nd edn. (2011)
Vallée-Rai, R., Co, P., Gagnon, E., Hendren, L.J., Lam, P., Sundaresan, V.: Soot - a Java bytecode optimization framework. In: CASCON. p. 13 (1999)
Xu, G., Rountev, A., Sridharan, M.: Scaling cfl-reachability-based points-to analysis using context-sensitive must-not-alias analysis. In: ECOOP (2009)
Yan, D., Xu, G., Rountev, A.: Demand-driven context-sensitive alias analysis for java. In: ISSTA. pp. 155–165 (2011)
Yuan, X., Gupta, R., Melhem, R.: Demand-driven data flow analysis for communication optimization. Parallel Processing Letters 07(04), 359–370 (1997)
Zheng, X., Rugina, R.: Demand-driven alias analysis for c. In: POPL. pp. 197–208 (2008)
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2020 The Author(s)
About this paper
Cite this paper
Chatterjee, K., Goharshady, A.K., Ibsen-Jensen, R., Pavlogiannis, A. (2020). Optimal and Perfectly Parallel Algorithms for On-demand Data-Flow Analysis. In: Müller, P. (eds) Programming Languages and Systems. ESOP 2020. Lecture Notes in Computer Science, vol 12075. Springer, Cham. https://doi.org/10.1007/978-3-030-44914-8_5
DOI: https://doi.org/10.1007/978-3-030-44914-8_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44913-1
Online ISBN: 978-3-030-44914-8
eBook Packages: Computer Science, Computer Science (R0)