Efficient Interprocedural Data-Flow Analysis Using Treedepth and Treewidth

  • Conference paper
Verification, Model Checking, and Abstract Interpretation (VMCAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13881)

Abstract

We consider interprocedural data-flow analysis as formalized by the standard IFDS framework, which can express many widely used static analyses such as reaching definitions, live variables, and null-pointer analysis. We focus on the well-studied on-demand setting, in which queries arrive one by one in a stream and each query should be answered as fast as possible. While the classical IFDS algorithm provides a polynomial-time solution for this problem, it does not scale in practice. More specifically, it either requires a quadratic-time preprocessing phase or takes linear time per query, both of which are untenable for modern codebases with hundreds of thousands of lines. Previous works have already shown that parameterizing the problem by the treewidth of the program’s control-flow graph is promising and can lead to significant gains in efficiency. Unfortunately, these results were only applicable to the limited special case of same-context queries.
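
The two costs mentioned above can be illustrated with a minimal sketch on plain directed-graph reachability. This is a deliberate simplification and not the IFDS algorithm: it ignores the matching of calls and returns that interprocedurally valid paths require, and the toy graph and all function names are hypothetical. Answering every possible query in advance takes roughly quadratic time, while searching from scratch takes linear time per query.

```python
from collections import deque


def reachable_from(graph, source):
    """Return all nodes reachable from source via BFS (linear in the graph size)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def preprocess_all_pairs(graph):
    """Exhaustive strategy: one BFS per node, so roughly quadratic total work."""
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    return {node: reachable_from(graph, node) for node in nodes}


def query_with_table(table, source, target):
    """After exhaustive preprocessing, a query is a single set lookup."""
    return target in table[source]


def query_on_demand(graph, source, target):
    """Without preprocessing, every query pays for a full search."""
    return target in reachable_from(graph, source)


# Hypothetical toy graph: d -> a -> b -> c
toy_graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
table = preprocess_all_pairs(toy_graph)
print(query_with_table(table, "d", "c"))
print(query_on_demand(toy_graph, "c", "a"))
```

Both strategies return the same answers; they differ only in where the work is spent, which is the trade-off the parameterized approach aims to avoid.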

In this work, we obtain significant speedups for the general case of on-demand IFDS with queries that are not necessarily same-context. This is achieved by exploiting a new graph sparsity parameter, namely the treedepth of the program’s call graph. Our approach is the first to exploit the sparsity of control-flow graphs and call graphs at the same time and to parameterize by both the treewidth and the treedepth. We obtain an algorithm with a linear preprocessing phase that can answer each query in constant time with respect to the size of the input. Finally, our experimental results demonstrate that our approach significantly outperforms the classical IFDS algorithm and its on-demand variant.
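
As a brief illustration of the treedepth parameter, the sketch below checks the standard defining property of a treedepth decomposition: a rooted forest over the graph's vertices in which every edge of the graph joins an ancestor-descendant pair. It treats the call graph as an undirected simple graph, counts depth in vertices, and uses a hypothetical four-function call graph; it only validates a given decomposition and is not the algorithm of the paper.

```python
def ancestors(parent, node):
    """Return the set of strict ancestors of node in the rooted forest `parent`."""
    result = set()
    while parent[node] is not None:
        node = parent[node]
        result.add(node)
    return result


def is_treedepth_decomposition(edges, parent):
    """Check that every graph edge joins an ancestor-descendant pair of the forest."""
    for u, v in edges:
        if u not in ancestors(parent, v) and v not in ancestors(parent, u):
            return False
    return True


def depth(parent):
    """Depth of the forest, counted as the number of vertices on a longest root path."""
    return max(len(ancestors(parent, node)) + 1 for node in parent)


# Hypothetical call graph: main calls f and g, and f calls h.
call_edges = [("main", "f"), ("main", "g"), ("f", "h")]
# Candidate forest, given as child -> parent (None marks a root).
parent = {"main": None, "f": "main", "g": "main", "h": "f"}

print(is_treedepth_decomposition(call_edges, parent))
print(depth(parent))
```

Adding an edge between `g` and `h` would invalidate this forest, since neither vertex is an ancestor of the other.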

Notes

  1. Instead of single data facts \(d_1\) and \(d_2\), we can also use a set of data facts at each of \(\ell_1\) and \(\ell_2\), but as we will see in Sect. 2, this does not affect the generality.

  2. This algorithm uses the Word-RAM model of computation. The division by \(\lg n\) is obtained by encoding \(\lg n\) bits in one word.

  3. The bags do not have to be disjoint.

  4. The name partial order tree is not standard in this context, but we use it throughout this work since it provides a good intuition about the nature of T. Usually, the term “treedepth decomposition” is used instead.

  5. For generating each query, we randomly and uniformly picked two points in the exploded supergraph. Note that none of our queries are same-context. Even when the two points of the query are in the same function, we are asking for reachability using interprocedurally valid paths that are not necessarily same-context.
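
The word-packing idea in note 2 can be illustrated with a small sketch. It assumes a hypothetical fixed word size `WORD_BITS` (on the Word-RAM this would be about \(\lg n\)) and shows only the general technique of packing bits into words so that a set union costs one operation per word rather than one per element; it is not taken from the paper.

```python
WORD_BITS = 8  # hypothetical word size; about lg n on the Word-RAM


def pack(bits):
    """Pack a list of 0/1 integers into a list of WORD_BITS-bit integers."""
    words = []
    for start in range(0, len(bits), WORD_BITS):
        word = 0
        for offset, bit in enumerate(bits[start:start + WORD_BITS]):
            word |= bit << offset
        words.append(word)
    return words


def union(words_a, words_b):
    """Bitwise OR of two packed sets: one operation per word, not per element."""
    return [a | b for a, b in zip(words_a, words_b)]


def contains(words, index):
    """Test whether element `index` is present in a packed set."""
    return (words[index // WORD_BITS] >> (index % WORD_BITS)) & 1 == 1


# Two hypothetical sets over 16 elements: {0, 9} and {3, 9, 15}.
set_a = pack([1 if i in (0, 9) else 0 for i in range(16)])
set_b = pack([1 if i in (3, 9, 15) else 0 for i in range(16)])
merged = union(set_a, set_b)
print(len(merged))
print([i for i in range(16) if contains(merged, i)])
```

Here the union of two 16-element sets touches only two words, which is the source of the saving by a factor of the word size.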

Acknowledgments

The research was partially supported by the Hong Kong Research Grants Council ECS Project Number 26208122, the HKUST-Kaisa Joint Research Institute Project Grant HKJRI3A-055 and the HKUST Startup Grant R9272. Author names are ordered alphabetically.

Author information

Corresponding authors

Correspondence to Amir Kafshdar Goharshady or Ahmed Khaled Zaher.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Goharshady, A.K., Zaher, A.K. (2023). Efficient Interprocedural Data-Flow Analysis Using Treedepth and Treewidth. In: Dragoi, C., Emmi, M., Wang, J. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2023. Lecture Notes in Computer Science, vol 13881. Springer, Cham. https://doi.org/10.1007/978-3-031-24950-1_9

  • DOI: https://doi.org/10.1007/978-3-031-24950-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24949-5

  • Online ISBN: 978-3-031-24950-1

  • eBook Packages: Computer Science, Computer Science (R0)
