Type-Based Decompilation (or Program Reconstruction via Type Reconstruction)

  • Alan Mycroft
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1576)

Abstract

We describe a system which decompiles (reverse engineers) C programs from target machine code by type-inference techniques. This extends recent trends in the converse process of compiling high-level languages, whereby type information is preserved during compilation. The algorithms remain independent of the particular architecture by virtue of treating target instructions as register-transfer specifications. Target code expressed in such RTL form is then transformed into SSA form (undoing register colouring etc.); the SSA form then generates a set of type constraints. Iteration and recursion over data structures cause synthesis of appropriate recursive C structs; this is triggered by, and resolves, occurs-check constraint violations. Other constraint violations are resolved by C’s casts and unions. In the limit we use heuristics to select between equally suitable C code; a good GUI would clearly facilitate its professional use.
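
The occurs-check step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the term encoding, the names `Var`, `unify`, `bind` and `obj`, the field offset 4, and the struct name `S0` are all assumptions made for illustration. The sketch unifies the type constraints that a list-walking loop might produce in SSA form and, when a variable would have to contain itself, introduces a named struct instead of failing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A type variable attached to one SSA name (hypothetical representation)."""
    name: str


# Type terms are tuples: ("int",), ("ptr", T), ("struct", name), and
# ("obj", offset, T) meaning "an object with a field of type T at byte offset".

def is_term(x):
    return isinstance(x, (Var, tuple))


def resolve(t, subst):
    """Follow bindings until an unbound variable or a constructor is reached."""
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """True if variable v appears anywhere inside term t."""
    t = resolve(t, subst)
    if isinstance(t, Var):
        return t == v
    return any(occurs(v, arg, subst) for arg in t[1:] if is_term(arg))


def bind(v, t, subst, structs):
    if not occurs(v, t, subst):
        subst[v] = t
        return
    # Occurs-check violation: v = ptr(obj(offset, ... v ...)) has no finite
    # solution, so name the pointed-to object as a (recursive) struct.
    inner = resolve(t[1], subst) if t[0] == "ptr" else None
    if isinstance(inner, tuple) and inner[0] == "obj":
        _, offset, field_type = inner
        name = f"S{len(structs)}"
        structs[name] = {offset: field_type}
        subst[v] = ("ptr", ("struct", name))
        return
    raise TypeError(f"cannot resolve cyclic constraint {v} = {t}")


def unify(a, b, subst, structs):
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if isinstance(a, Var):
        bind(a, b, subst, structs)
    elif isinstance(b, Var):
        bind(b, a, subst, structs)
    elif a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            if is_term(x) and is_term(y):
                unify(x, y, subst, structs)
            elif x != y:
                raise TypeError(f"mismatch: {a} vs {b}")
    else:
        # A real system would fall back to a cast or union here.
        raise TypeError(f"clash: {a} vs {b}")


# Assumed SSA for a loop that repeatedly executes r0 = load [r0 + 4]:
#   r0_1 = phi(r0_0, r0_2);  r0_2 = load [r0_1 + 4]
t0, t1, t2 = Var("t0"), Var("t1"), Var("t2")
subst, structs = {}, {}
unify(t1, t0, subst, structs)                          # phi, first operand
unify(t1, t2, subst, structs)                          # phi, second operand
unify(t1, ("ptr", ("obj", 4, t2)), subst, structs)     # load from r0_1 + 4
```

After the three constraints are processed, every SSA name resolves to a pointer to `S0`, and the field of `S0` at offset 4 resolves to the same pointer type, which corresponds to a self-referential C struct with a pointer-to-self member.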

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Alan Mycroft, Computer Laboratory, Cambridge University, Cambridge, UK
