Structuring decompiled graphs

  • Cristina Cifuentes
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1060)

Abstract

A structuring algorithm for arbitrary control flow graphs is presented. Graphs are structured into functionally, semantically and structurally equivalent graphs, without code replication or the introduction of new variables. The algorithm makes use of a set of generic high-level language structures that includes different types of loops and conditionals. Gotos are used only when the graph cannot be structured with the structures in the generic set.

This algorithm is adequate for the control flow analysis required when decompiling programs, given that a pure binary program does not contain information on the high-level structures used by the initial high-level language program (i.e., before compilation). The algorithm has been implemented as part of the dcc decompiler, an i80286 decompiler of DOS binary programs, and has proved successful in its aim of structuring decompiled graphs.
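
As a rough illustration of the kind of control flow analysis described above, the following minimal sketch finds loops in a small control flow graph and guesses which generic loop structure each one matches. It is not the paper's algorithm: the dictionary-based graph representation, the function names (find_back_edges, classify_loop, loops), the depth-first back-edge test, and the loop labels "pre-tested", "post-tested" and "endless" are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: each node maps to its list of successors.
CFG = Dict[str, List[str]]


def find_back_edges(cfg: CFG, entry: str) -> List[Tuple[str, str]]:
    """Return edges (tail, head) whose head is still on the DFS stack."""
    state: Dict[str, int] = {}  # absent = unvisited, 1 = on stack, 2 = finished
    back_edges: List[Tuple[str, str]] = []

    def visit(node: str) -> None:
        state[node] = 1
        for succ in cfg.get(node, []):
            if succ not in state:
                visit(succ)
            elif state[succ] == 1:
                # The successor is an ancestor in the DFS tree: a loop edge.
                back_edges.append((node, succ))
        state[node] = 2

    visit(entry)
    return back_edges


def classify_loop(cfg: CFG, tail: str, head: str) -> str:
    """Simplified heuristic: look at which end of the back edge branches."""
    if len(cfg.get(tail, [])) == 2:
        return "post-tested"  # condition tested at the bottom (repeat..until)
    if len(cfg.get(head, [])) == 2:
        return "pre-tested"   # condition tested at the top (while)
    return "endless"          # no conditional exit at either end


def loops(cfg: CFG, entry: str) -> List[Tuple[str, str]]:
    """List (loop header, loop kind) for every back edge reachable from entry."""
    return [(head, classify_loop(cfg, tail, head))
            for tail, head in find_back_edges(cfg, entry)]


# A while-style loop: the header "test" either enters the body or exits.
while_cfg: CFG = {"entry": ["test"], "test": ["body", "exit"],
                  "body": ["test"], "exit": []}

# A repeat-style loop: the body branches back to itself or exits.
repeat_cfg: CFG = {"entry": ["body"], "body": ["body", "exit"], "exit": []}

print(loops(while_cfg, "entry"))
print(loops(repeat_cfg, "entry"))
```

A full structuring pass would also have to handle conditionals, nested and multi-exit loops, and irreducible graphs, falling back to gotos where no generic structure fits; none of that is attempted here.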

Copyright information

© Springer-Verlag 1996

Authors and Affiliations

  • Cristina Cifuentes (1)
  1. Department of Computer Science, University of Tasmania, Hobart, Australia
