Optimized Register Renaming Scheme for Stack-Based x86 Operations

  • Xuehai Qian
  • He Huang
  • Zhenzhong Duan
  • Junchao Zhang
  • Nan Yuan
  • Yongbin Zhou
  • Hao Zhang
  • Huimin Cui
  • Dongrui Fan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4415)

Abstract

The stack-based floating point unit (FPU) of the x86 architecture limits its floating point (FP) performance. A flat register file can improve FP performance but affects x86 compatibility. This paper presents an optimized two-phase floating point register renaming scheme used in implementing an x86-compliant processor. The two-phase renaming scheme eliminates the implicit dependencies between consecutive FP instructions as well as redundant operations. As two applications of the method, the techniques used in the second phase of the scheme can eliminate redundant loads and reduce the mis-speculation ratio of the load-store queue. Moreover, the performance of a binary translation system that translates x86 instructions to a MIPS-like ISA can also be boosted by adding the architectural support from this optimized scheme to the architecture.
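To make the idea of renaming stack-relative operands concrete, the following is a minimal sketch of a generic two-level mapping for an x87-style register stack. It is not the scheme from the paper, whose details are not given in this abstract. The class `StackRenamer`, its method names, and the physical register count are all hypothetical, and retirement and freeing of old physical registers are deliberately left out.

```python
class StackRenamer:
    """Toy two-level renamer for an 8-entry x87-style FP register stack.

    Hypothetical illustration only: names and sizes are assumptions,
    and physical registers are never freed (no retirement model).
    """

    def __init__(self, num_physical=16):
        self.top = 0                      # x87 TOP-of-stack pointer (3 bits)
        self.alias = list(range(8))       # level 1: stack slot -> flat logical register
        self.phys = list(range(8))        # level 2: logical register -> physical register
        self.free = list(range(8, num_physical))  # unallocated physical registers

    def _slot(self, i):
        # ST(i) is addressed relative to TOP, modulo the stack depth.
        return (self.top + i) % 8

    def logical(self, i):
        """Level 1: resolve ST(i) to a flat logical register."""
        return self.alias[self._slot(i)]

    def read(self, i):
        """Physical register currently holding ST(i)."""
        return self.phys[self.logical(i)]

    def write(self, i):
        """Level 2: allocate a fresh physical register for a write to ST(i)."""
        new = self.free.pop(0)
        self.phys[self.logical(i)] = new
        return new

    def push(self):
        """FLD-style push: decrement TOP, then write the new ST(0)."""
        self.top = (self.top - 1) % 8
        return self.write(0)

    def pop(self):
        """Pop: increment TOP; no register contents move."""
        self.top = (self.top + 1) % 8

    def fxch(self, i):
        """FXCH ST(i): swap two alias entries instead of moving data."""
        a, b = self._slot(0), self._slot(i)
        self.alias[a], self.alias[b] = self.alias[b], self.alias[a]


r = StackRenamer()
first = r.push()    # value A lands in a fresh physical register
second = r.push()   # value B lands in another one; B is ST(0), A is ST(1)
r.fxch(1)           # exchange is resolved in the mapping, not in the data path
assert r.read(0) == first and r.read(1) == second
```

In this sketch the TOP pointer and the exchange are resolved when operands are mapped, so a later instruction that names ST(0) does not have to wait on an earlier push, pop, or exchange to execute. That is one way to read the "implicit dependencies" mentioned above, under the stated assumptions.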

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Xuehai Qian (1)
  • He Huang (1)
  • Zhenzhong Duan (1)
  • Junchao Zhang (1)
  • Nan Yuan (1)
  • Yongbin Zhou (1)
  • Hao Zhang (1)
  • Huimin Cui (1)
  • Dongrui Fan (1)

  1. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
