Staged Points-to Analysis for Large Code Bases

  • Nicholas Allen
  • Bernhard Scholz
  • Padmanabhan Krishnan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9031)

Abstract

Bug checker tools for Java require fine-grained heap abstractions, including object-sensitive call graphs, field information for objects, and points-to sets for program variables, to find bugs in source code. However, these heap abstractions, commonly known as points-to analyses, have high runtime complexity, especially when the points-to analysis is context-sensitive. Hence, state-of-the-art points-to analyses do not scale to large code bases.

In this paper, we introduce a new points-to framework that facilitates the computation of context-sensitive points-to analysis for large code bases. The framework is demand-driven, i.e., a client queries the points-to information for some program variables. The novelty of our approach is a pre-analysis technique that combines staged points-to analyses with program slicing and program compaction. We implemented the proposed points-to framework in Datalog for a proprietary bug checker that could identify security vulnerabilities in the OpenJDK™ library, which has approximately 1.3 million variables and 500,000 allocation sites. For the clients that we have chosen, our technique is able to eliminate about 73% of all variables and about 95% of allocation sites. Thus, our points-to framework scales to code bases with millions of program variables and hundreds of thousands of methods.
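
The idea of a demand-driven pre-analysis can be illustrated with a minimal sketch. This is not the authors' Datalog implementation: it is a context-insensitive toy in Python that ignores fields and method calls, and the function names (`backward_slice`, `points_to`, `staged_points_to`) and the tuple encoding of statements are assumptions made for illustration only. A client supplies the queried variables, a backward slice discards every variable that cannot flow into them, and an inclusion-based points-to fixpoint then runs on the reduced program.

```python
from collections import defaultdict


def backward_slice(assigns, queries):
    """Return every variable whose value may flow into a queried variable.

    assigns is a list of (dst, src) pairs, one per statement `dst = src`.
    """
    preds = defaultdict(set)
    for dst, src in assigns:
        preds[dst].add(src)
    relevant, work = set(queries), list(queries)
    while work:
        var = work.pop()
        for src in preds[var]:
            if src not in relevant:
                relevant.add(src)
                work.append(src)
    return relevant


def points_to(allocs, assigns):
    """Naive fixpoint for the two Datalog-style rules
         pts(v, h) :- alloc(v, h).
         pts(d, h) :- assign(d, s), pts(s, h).
    allocs is a list of (var, site) pairs, one per `var = new ...`.
    """
    pts = defaultdict(set)
    for var, site in allocs:
        pts[var].add(site)
    changed = True
    while changed:
        changed = False
        for dst, src in assigns:
            new = pts[src] - pts[dst]
            if new:
                pts[dst] |= new
                changed = True
    return pts


def staged_points_to(allocs, assigns, queries):
    """Stage 1: slice the program. Stage 2: analyse only what is left."""
    keep = backward_slice(assigns, queries)
    small_allocs = [(v, h) for v, h in allocs if v in keep]
    # If dst is in the slice, src is too, so filtering on dst suffices.
    small_assigns = [(d, s) for d, s in assigns if d in keep]
    pts = points_to(small_allocs, small_assigns)
    return {q: set(pts[q]) for q in queries}


if __name__ == "__main__":
    allocs = [("a", "h1"), ("b", "h2"), ("x", "h3")]
    assigns = [("c", "a"), ("c", "b"), ("y", "x"), ("z", "y")]
    # Only a, b and c are relevant to the query; x, y, z and h3 are dropped.
    print(staged_points_to(allocs, assigns, ["c"]))
```

In this toy program the query on `c` keeps three of six variables and two of three allocation sites, and the answer for `c` matches what the whole-program fixpoint would compute. The reductions reported in the abstract come from the authors' full framework, not from a sketch like this one.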



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Nicholas Allen (1)
  • Bernhard Scholz (1)
  • Padmanabhan Krishnan (1)
  1. Oracle Labs, Brisbane, Australia
