ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation

  • Maxwell Souza
  • Daniel Nicácio
  • Guido Araújo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6161)

Abstract

Dynamic Binary Translation (DBT) techniques have been widely used in the migration of legacy code and in the transparent execution of programs across different architectures. They have also been used in dynamic optimizing compilers to collect runtime information that helps improve code quality. In many cases, the DBT translation mechanism misses important low-level mapping opportunities available between the source and target ISAs. Hot code performance has been shown to be central to overall program performance, and different instruction mappings can account for large performance gains. Hence, DBT techniques that provide efficient instruction mapping at the ISA level have the potential to considerably improve performance. This paper proposes ISAMAP, a flexible instruction mapping driven by dynamic binary translation. Its mapping mechanism provides fast translation between ISAs under an easy-to-use description. In its current state, ISAMAP is capable of translating 32-bit PowerPC code to 32-bit x86 and of performing local optimizations on the resulting x86 code. Our experimental results show that ISAMAP is capable of executing PowerPC code on an x86 host faster than the processor emulator QEMU, achieving speedups of up to 3.16x for SPEC CPU2000 programs.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Maxwell Souza (1)
  • Daniel Nicácio (1)
  • Guido Araújo (1)
  1. Institute of Computing, UNICAMP, Brazil
