Effects of Program Representation on Pointer Analyses — An Empirical Study

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12649)

Abstract

Static analysis frameworks, such as Soot and Wala, are used by researchers to prototype and compare program analyses. These frameworks differ in their heap abstraction, their modeling of library classes, and their underlying intermediate representation (IR). These variations often pose a threat to the validity of results, because the implications of comparing the same analysis implementation across different frameworks are still unexplored. Earlier studies have focused on the precision, soundness, and recall of the algorithms implemented in these frameworks; however, little to no work has evaluated the effects of program representation. In this work, we fill this gap and study the impact of program representation on pointer analysis. Unfortunately, existing metrics are insufficient for such a comparison because they cannot isolate each aspect of the program representation. Therefore, we define two novel metrics that measure the precision of these analyses after isolating the influence of the class hierarchy and the intermediate representation. Our results establish that minor differences in the class hierarchy and IR do not impact program analysis significantly. They also reveal sources of unsoundness, which can aid researchers in developing program analyses.

Keywords

  • Pointer Analysis
  • Java
  • Program Analysis
  • Empirical Studies

Author information

Corresponding author

Correspondence to Jyoti Prakash.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2021 The Author(s)

About this paper

Cite this paper

Prakash, J., Tiwari, A., Hammer, C. (2021). Effects of Program Representation on Pointer Analyses — An Empirical Study. In: Guerra, E., Stoelinga, M. (eds) Fundamental Approaches to Software Engineering. FASE 2021. Lecture Notes in Computer Science, vol 12649. Springer, Cham. https://doi.org/10.1007/978-3-030-71500-7_12

  • DOI: https://doi.org/10.1007/978-3-030-71500-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71499-4

  • Online ISBN: 978-3-030-71500-7

  • eBook Packages: Computer Science, Computer Science (R0)