CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs
This paper describes the C Intermediate Language (CIL): a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.
Compared to C, CIL has fewer constructs. It breaks down certain complicated constructs of C into simpler ones, and thus works at a lower level than abstract-syntax trees. But CIL is also higher-level than typical intermediate languages designed for compilation (e.g., three-address code). The result is a representation that makes it easy to analyze and manipulate C programs and to emit them in a form that resembles the original source. Moreover, CIL comes with a front end that translates to CIL not only ANSI C programs but also those using Microsoft C or GNU C extensions.
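To make the idea of breaking complicated constructs into simpler ones concrete, the following is a minimal hand-written sketch in C. It is not actual CIL output: the function names `sum_original` and `sum_simplified` and the temporary `tmp` are hypothetical, and the simplified form only illustrates the general flavor of such a lowering (explicit short-circuit evaluation, one side effect per statement) while remaining readable C.

```c
#include <stddef.h>

/* Original style: a short-circuit condition and a side effect
   (i++) nested inside a larger expression. */
int sum_original(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    while (i < n && a[i] != 0)
        s += a[i++];
    return s;
}

/* Hand-simplified equivalent (illustrative only, not CIL output):
   the && is unfolded into two explicit tests, and every statement
   performs at most one side effect. */
int sum_simplified(const int *a, size_t n) {
    int s = 0;
    size_t i = 0;
    while (1) {
        if (!(i < n)) break;      /* first operand of && */
        if (a[i] == 0) break;     /* second operand, evaluated only if i < n */
        int tmp = a[i];           /* read before the increment */
        i = i + 1;
        s = s + tmp;
    }
    return s;
}
```

Both functions compute the same result for any input. The second form is easier for an analysis to traverse because evaluation order is explicit in the statement sequence.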
We describe the structure of CIL with a focus on how it disambiguates those features of C that we found to be most confusing for program analysis and transformation. We also describe a whole-program merger based on structural type equality, allowing a complete project to be viewed as a single compilation unit. As a representative application of CIL, we show a transformation aimed at making code immune to stack-smashing attacks. We are currently using CIL as part of a system that analyzes and instruments C programs with run-time checks to ensure type safety. CIL has served us very well in this project, and we believe it can usefully be applied in other situations as well.
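The notion of structural type equality used by the merger can be sketched as follows. This is a simplified illustration under stated assumptions, not CIL's implementation: the `struct type` descriptor, the `kind` enumeration, and `type_equal` are hypothetical names; the sketch compares only the shape of types (ignoring struct tags and field names) and assumes acyclic descriptors, so recursive types would need an additional visited set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type descriptor (not CIL's representation). */
enum kind { K_INT, K_CHAR, K_PTR, K_STRUCT };

struct type {
    enum kind kind;
    const char *tag;                   /* struct tag; ignored when comparing */
    const struct type *target;         /* pointee type, for K_PTR */
    const struct type *const *fields;  /* field types, for K_STRUCT */
    size_t nfields;
};

/* Structural equality: two types are equal when they have the same shape,
   regardless of the names they were declared under.
   Assumption: descriptors are acyclic (no recursive types). */
bool type_equal(const struct type *a, const struct type *b) {
    if (a == b) return true;
    if (a->kind != b->kind) return false;
    switch (a->kind) {
    case K_INT:
    case K_CHAR:
        return true;
    case K_PTR:
        return type_equal(a->target, b->target);
    case K_STRUCT:
        if (a->nfields != b->nfields) return false;
        for (size_t i = 0; i < a->nfields; i++)
            if (!type_equal(a->fields[i], b->fields[i])) return false;
        return true;
    }
    return false;
}
```

Under this scheme, two structs declared with different tags in different source files but with identical field types would be treated as one type when the files are merged into a single compilation unit.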