Abstract
We describe a system that decompiles (reverse engineers) C programs from target machine code using type-inference techniques. This extends a recent trend in the converse process of compiling high-level languages, in which type information is preserved during compilation. The algorithms remain independent of the particular architecture by treating target instructions as register-transfer specifications. Target code expressed in this register-transfer (RTL) form is transformed into static single assignment (SSA) form (undoing register colouring etc.), from which a set of type constraints is generated. Iteration and recursion over data structures cause appropriate recursive C structs to be synthesised; this synthesis is triggered by, and resolves, occurs-check constraint violations. Other constraint violations are resolved by C’s casts and unions. In the limit we use heuristics to select among equally suitable candidate C programs; a good GUI would clearly facilitate professional use of the system.
A preliminary form of this work was presented at the APPSEM’98 workshop in Pisa.
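The step from type constraints to recursive structs can be illustrated with a small unifier. The following Python sketch is not the paper's implementation: the term encoding, the SSA fragment, the field names `f0`/`f1` and all function names are assumptions made for illustration. It unifies constraints gathered from a hypothetical list-walking loop and, when the occurs check fails on a pointer-to-struct term, names the struct and ties the knot through that name.

```python
from itertools import count

# Hypothetical term encoding for this sketch:
#   "?r1"                  type variable for SSA register r1
#   ("int",)               C int
#   ("ptr", t)             pointer to t
#   ("struct", f0, f1)     anonymous struct, fields in offset order
#   ("named", "s0")        reference to a synthesised struct definition

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, subst):
    """Follow variable bindings to an unbound variable or a constructor."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """True if variable v appears anywhere inside term t."""
    t = walk(t, subst)
    if is_var(t):
        return t == v
    if t[0] in ("int", "named"):
        return False
    return any(occurs(v, arg, subst) for arg in t[1:])

def unify(a, b, subst, structs, fresh):
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return
    if is_var(b):
        a, b = b, a
    if is_var(a):
        if not occurs(a, b, subst):
            subst[a] = b
            return
        # Occurs-check violation such as a = ptr(struct{a, ...}):
        # name the struct so the cycle goes through the name.
        body = walk(b[1], subst) if b[0] == "ptr" else None
        if body is not None and not is_var(body) and body[0] == "struct":
            if body not in structs:
                structs[body] = f"s{next(fresh)}"
            subst[a] = ("ptr", ("named", structs[body]))
            return
        raise TypeError(f"unresolved cyclic constraint: {a} = {b}")
    if a[0] != b[0] or len(a) != len(b):
        # A fuller system would fall back to a cast or union here.
        raise TypeError(f"cannot unify {a} with {b}")
    for x, y in zip(a[1:], b[1:]):
        unify(x, y, subst, structs, fresh)

def decl(ty, name):
    return f"{ty}{name}" if ty.endswith("*") else f"{ty} {name}"

def show(t, subst, structs):
    """Render a type term as C type text."""
    t = walk(t, subst)
    if is_var(t):
        return t  # unconstrained variable, left symbolic
    if t[0] == "int":
        return "int"
    if t[0] == "named":
        return f"struct {t[1]}"
    if t[0] == "ptr":
        return show(t[1], subst, structs) + " *"
    if t in structs:
        return f"struct {structs[t]}"
    fields = " ".join(decl(show(f, subst, structs), f"f{i}") + ";"
                      for i, f in enumerate(t[1:]))
    return f"struct {{ {fields} }}"

def show_def(body, subst, structs):
    fields = " ".join(decl(show(f, subst, structs), f"f{i}") + ";"
                      for i, f in enumerate(body[1:]))
    return f"struct {structs[body]} {{ {fields} }};"

def infer(constraints):
    subst, structs, fresh = {}, {}, count()
    for lhs, rhs in constraints:
        unify(lhs, rhs, subst, structs, fresh)
    return subst, structs

# Hypothetical SSA fragment for a loop that sums a linked list:
#   r1 = phi(r0, r2); r2 = load [r1+0]; r3 = load [r1+4]; sum += r3
constraints = [
    ("?r1", ("ptr", ("struct", "?r2", "?r3"))),  # loads at offsets 0 and 4
    ("?r1", "?r2"),                              # phi: r1 and r2 share a type
    ("?r1", "?r0"),                              # phi: r1 and r0 share a type
    ("?r3", ("int",)),                           # r3 is used in integer addition
]
subst, structs = infer(constraints)
definition = show_def(next(iter(structs)), subst, structs)
types = {r: show("?" + r, subst, structs) for r in ("r0", "r1", "r2", "r3")}
print(definition)  # struct s0 { struct s0 *f0; int f1; };
print(types["r1"])  # struct s0 *
```

The phi constraint forces `r1` and `r2` to share a type while the load makes `r2` a field of what `r1` points to, which is exactly the cyclic equation that the occurs check rejects. Constraint mismatches of any other kind simply raise `TypeError` in this sketch, whereas the abstract describes resolving them with casts and unions.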
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mycroft, A. (1999). Type-Based Decompilation (or Program Reconstruction via Type Reconstruction). In: Swierstra, S.D. (eds) Programming Languages and Systems. ESOP 1999. Lecture Notes in Computer Science, vol 1576. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49099-X_14
DOI: https://doi.org/10.1007/3-540-49099-X_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65699-9
Online ISBN: 978-3-540-49099-9