Abstract
Java virtual machines execute Java bytecode instructions. Since this bytecode is a higher level representation than traditional object code, it is possible to decompile it back to Java source. Many such decompilers have been developed and the conventional wisdom is that decompiling Java bytecode is relatively simple. This may be true when decompiling bytecode produced directly from a specific compiler, most often Sun’s javac compiler. In this case it is really a matter of inverting a known compilation strategy. However, there are many problems, traps and pitfalls when decompiling arbitrary verifiable Java bytecode. Such bytecode could be produced by other Java compilers, Java bytecode optimizers or Java bytecode obfuscators. Java bytecode can also be produced by compilers for other languages, including Haskell, Eiffel, ML, Ada and Fortran. These compilers often use very different code generation strategies from javac.
This paper outlines the problems and solutions we have found in our development of Dava, a decompiler for arbitrary Java bytecode. We first outline the problems in assigning types to variables and literals, and the problems due to expression evaluation on the Java stack. Then, we look at finding structured control flow with a particular emphasis on issues related to Java exceptions and synchronized blocks. Throughout the paper we provide small examples which are not properly decompiled by commonly used decompilers.
Chapter PDF
References
Z. Ammarguellat. A control-flow normalization algorithm and its complexity. IEEE Transactions on Software Engineering, 18(3):237–250, March 1992.
B. S. Baker. An algorithm for structuring flowgraphs. Journal of the Association for Computing Machinery, pages 98–120, January 1977.
C. Cifuentes. Reverse Compilation Techniques. PhD thesis, Queensland University of Technology, July 1994.
A. M. Erosa and L. J. Hendren. Taming control flow: A structured approach to eliminating goto statements. In Proceedings of the 1994 International Conference on Computer Languages, pages 229–240, May 1994.
E. M. Gagnon, L. J. Hendren, and G. Marceau. Efficient inference of static types for Java bytecode. In Static Analysis Symposium 2000, Lecture Notes in Computer Science, pages 199–219, Santa Barbara, June 2000.
Jad-the fast JAva Decompiler. http://www.geocities.com/SiliconValley/-Bridge/8617/jad.html.
T. Knoblock and J. Rehof. Type elaboration and subtype completion for java bytecode. In Proceedings 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages., 2000.
J. Miecznikowski and L. Hendren. Decompiling Java using staged encapsulation. In Proceedings of the Working Conference on Reverse Engineering, pages 368–374, October 2001.
T. A. Proebsting and S. A. Watterson. Krakatoa: Decompilation in Java (Does bytecode reveal source?). In 3rd USENIX Conference on Object-Oriented Technologies and Systems (COOTS’97), pages 185–197, June 1997.
L. Ramshaw. Eliminating go to’s while preserving program structure. Journal of the Association for Computing Machinery, 35(4):893–920, October 1988.
Soot-a Java Optimization Framework. http://www.sable.mcgill.ca/soot/.
R. Vallée-Rai, E. Gagnon, L. Hendren, P. Lam, P. Pominville, and V. Sundaresan. Optimizing Java bytecode using the Soot framework: Is it feasible? In D. A. Watt, editor, Compiler Construction, 9th International Conference, volume 1781 of Lecture Notes in Computer Science, pages 18–34, Berlin, Germany, March 2000. Springer.
WingDis-A Java Decompiler. http://www.wingsoft.com/wingdis.html.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miecznikowski, J., Hendren, L. (2002). Decompiling Java Bytecode: Problems, Traps and Pitfalls. In: Horspool, R.N. (eds) Compiler Construction. CC 2002. Lecture Notes in Computer Science, vol 2304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45937-5_10
Download citation
DOI: https://doi.org/10.1007/3-540-45937-5_10
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43369-9
Online ISBN: 978-3-540-45937-8
eBook Packages: Springer Book Archive