Abstract
A structuring algorithm for arbitrary control flow graphs is presented. Graphs are structured into functionally, semantically and structurally equivalent graphs, without code replication or the introduction of new variables. The algorithm makes use of a set of generic high-level language structures that includes different types of loops and conditionals. Gotos are used only when the graph cannot be structured with the structures in the generic set.
This algorithm is adequate for the control flow analysis required when decompiling programs, given that a pure binary program does not contain information on the high-level structures used by the initial high-level language program (i.e. before compilation). The algorithm has been implemented as part of the dcc decompiler, an i80286 decompiler for DOS binary programs, and has proved successful in its aim of structuring decompiled graphs.
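As a rough illustration of what recognising loops and conditionals in a control flow graph can look like, the following minimal sketch labels the nodes of a small graph. It is not the algorithm of this paper: it assumes a graph given as a plain adjacency dictionary, treats any edge that returns to a node on the current depth-first path as a loop back edge, and uses hypothetical names (`find_back_edges`, `classify_nodes`) chosen only for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: node name -> list of successor node names.
Graph = Dict[str, List[str]]


def find_back_edges(cfg: Graph, entry: str) -> List[Tuple[str, str]]:
    """Return edges (tail, head) whose head lies on the current DFS path."""
    back_edges: List[Tuple[str, str]] = []
    visited: Set[str] = set()
    on_path: Set[str] = set()

    def visit(node: str) -> None:
        visited.add(node)
        on_path.add(node)
        for succ in cfg.get(node, []):
            if succ in on_path:
                # The edge returns to an ancestor: treat it as closing a loop.
                back_edges.append((node, succ))
            elif succ not in visited:
                visit(succ)
        on_path.discard(node)

    visit(entry)
    return back_edges


def classify_nodes(cfg: Graph, entry: str) -> Dict[str, str]:
    """Label each node as a loop header, a conditional, or plain sequence."""
    loop_headers = {head for _, head in find_back_edges(cfg, entry)}
    labels: Dict[str, str] = {}
    for node, succs in cfg.items():
        if node in loop_headers:
            labels[node] = "loop header"
        elif len(succs) == 2:
            labels[node] = "two-way conditional"
        elif len(succs) > 2:
            labels[node] = "n-way conditional"
        else:
            labels[node] = "sequence"
    return labels


# A while loop: "test" either enters "body" (which jumps back) or exits.
while_cfg: Graph = {
    "entry": ["test"],
    "test": ["body", "exit"],
    "body": ["test"],
    "exit": [],
}

# An if-then-else: "a" branches to "b" or "c", which both join at "d".
if_cfg: Graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}

print(classify_nodes(while_cfg, "entry"))
print(classify_nodes(if_cfg, "a"))
```

A node that fits none of the recognised patterns would, in the spirit of the abstract, be the place where a goto is emitted; this sketch does not attempt that step.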
This work was done while the author was with the Queensland University of Technology in Brisbane, Australia. This research was partly funded by Australian Research Council (ARC) grant No. A49130261.
© 1996 Springer-Verlag Berlin Heidelberg
Cite this paper
Cifuentes, C. (1996). Structuring decompiled graphs. In: Gyimóthy, T. (eds) Compiler Construction. CC 1996. Lecture Notes in Computer Science, vol 1060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61053-7_55
DOI: https://doi.org/10.1007/3-540-61053-7_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61053-3
Online ISBN: 978-3-540-49939-8