Mathematics in Computer Science, Volume 5, Issue 3, pp 335–356

A Principled, Complete, and Efficient Representation of C++

Article

DOI: 10.1007/s11786-011-0094-1

Cite this article as:
Dos Reis, G. & Stroustrup, B. Math.Comput.Sci. (2011) 5: 335. doi:10.1007/s11786-011-0094-1

Abstract

We present a systematic representation of C++, called IPR, for complete semantic analysis and semantics-based program transformations. We describe the ideas and design principles that shaped the IPR. In particular, we describe how general type-based unification is key to a minimal, compact representation, fast type-safe traversal, and scalability. For example, the representation of a fairly typical non-trivial C++ program in GCC 3.4.2 was 32 times larger than its IPR representation; this led to significant improvements to GCC. IPR is general enough to handle real-world programs involving many translation units, archaic programming styles, and generic programming using C++0x extensions that affect the type system. The difficult issue of how to represent irregular (ad hoc) features in a systematic (non-ad hoc) manner is among the key contributions of this paper. The IPR data structure can represent all of C++ with just 157 simple node types; by comparison, the ISO C++ grammar has over 700 productions. The IPR is used for a variety of program analysis and transformation tasks, such as visualization, loop simplification, and concept extraction. Finally, we report the impact of this work on existing C++ compilers.
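To make the idea of node unification concrete, the following is a minimal sketch of one common way to share nodes: a factory that interns type nodes so that each distinct type is stored once and equality reduces to pointer comparison. It is an illustration only and does not reproduce the IPR interface; `Kind`, `Type`, `TypeFactory`, and `demo` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical node kinds; a real representation has many more.
enum class Kind { Builtin, Pointer };

struct Type {
    Kind kind;
    std::string name;     // used by Builtin nodes, empty otherwise
    const Type* pointee;  // used by Pointer nodes, nullptr otherwise
    std::size_t id;       // creation index, used to build lookup keys
};

// Hands out exactly one node per distinct type, so structurally equal
// types share storage and can be compared by address.
class TypeFactory {
public:
    const Type* builtin(const std::string& name) {
        return intern(Kind::Builtin, name, nullptr);
    }
    const Type* pointer(const Type* pointee) {
        return intern(Kind::Pointer, "", pointee);
    }
    std::size_t size() const { return nodes_.size(); }

private:
    // Key: node kind, name, and pointee id + 1 (0 means "no pointee").
    using Key = std::tuple<int, std::string, std::size_t>;

    const Type* intern(Kind kind, const std::string& name, const Type* pointee) {
        Key key{static_cast<int>(kind), name, pointee ? pointee->id + 1 : 0};
        auto it = table_.find(key);
        if (it != table_.end()) return it->second;  // reuse the existing node
        nodes_.push_back(
            std::make_unique<Type>(Type{kind, name, pointee, nodes_.size()}));
        const Type* node = nodes_.back().get();
        table_.emplace(key, node);
        return node;
    }

    std::map<Key, const Type*> table_;
    std::vector<std::unique_ptr<Type>> nodes_;
};

// Small self-check: building `int*` twice yields the same node.
inline void demo() {
    TypeFactory types;
    const Type* a = types.pointer(types.builtin("int"));
    const Type* b = types.pointer(types.builtin("int"));
    assert(a == b);
    assert(types.size() == 2);  // one node for int, one for int*
}
```

In this sketch, repeated occurrences of the same type cost no extra memory after the first, which is the kind of sharing that keeps a whole-program representation compact.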

Keywords

Computer program representation, Semantics, Static analysis, C++

Mathematics Subject Classification (2000)

68N30, 68N19, 68N20, 68N15

Copyright information

© Springer Basel AG 2011

Authors and Affiliations

  1. Texas A&M University, College Station, USA