High Performance Embedded Architectures and Compilers

Volume 4917 of the series Lecture Notes in Computer Science pp 273-287

LPA: A First Approach to the Loop Processor Architecture

  • Alejandro García (Universitat Politècnica de Catalunya)
  • Oliverio J. Santana (Universidad de Las Palmas de Gran Canaria)
  • Enrique Fernández (Universidad de Las Palmas de Gran Canaria)
  • Pedro Medina (Universidad de Las Palmas de Gran Canaria)
  • Mateo Valero (Universitat Politècnica de Catalunya; Barcelona Supercomputing Center)


Current processors frequently run applications containing loop structures. However, traditional processor designs do not take into account the semantic information of the executed loops, failing to exploit an important opportunity. In this paper, we take our first step toward a loop-conscious processor architecture that has great potential to achieve high performance and relatively low energy consumption.

In particular, we propose to store simple dynamic loops in a buffer that we call the loop window. Loop instructions are kept in the loop window along with all the information needed to build the rename mapping. The loop window can therefore feed the execution back-end queues directly, bypassing the prediction, fetch, decode, and rename stages of the normal processor pipeline while a captured loop is running. Our results show that the loop window is a worthwhile, complexity-effective alternative for processor design: it reduces front-end activity by 14% for SPECint benchmarks and by 45% for SPECfp benchmarks.
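The idea of supplying loop iterations from a buffer instead of the front-end can be illustrated with a toy trace-driven model. The sketch below is not the mechanism evaluated in the paper: the detection rule (a taken backward branch), the capture policy (record the next full iteration), the `capacity` parameter, and the function name `simulate` are all assumptions made for illustration, and rename-mapping information is not modeled. It only counts how many dynamic instructions would pass through the front-end and how many could come from a loop window.

```python
def simulate(trace, capacity=8):
    """Toy model: count instructions served by the front-end vs. a loop window.

    `trace` is a list of program counters in execution order (hypothetical
    input format). A loop is assumed to be signalled by a taken backward
    branch, i.e. the next PC is <= the current PC.
    """
    front_end = 0     # went through predict/fetch/decode/rename
    from_window = 0   # supplied directly by the loop window
    state = "idle"    # "idle" -> "capture" -> "replay"
    start = end = None
    body, pos = [], 0

    for i, pc in enumerate(trace):
        nxt = trace[i + 1] if i + 1 < len(trace) else None

        if state == "replay":
            if pc == body[pos]:
                from_window += 1
                pos += 1
                if pos == len(body):      # finished one iteration
                    pos = 0
                    if nxt != start:      # loop exits after this iteration
                        state = "idle"
                continue
            state = "idle"                # control flow diverged from the stored body

        front_end += 1                    # every non-replayed instruction uses the front-end

        if state == "capture":
            body.append(pc)
            if pc == end:                 # reached the loop-closing branch
                state = "replay" if nxt == start else "idle"
                pos = 0
            elif len(body) >= capacity or not (start <= pc <= end):
                state = "idle"            # loop too large or not a simple loop
            continue

        if nxt is not None and nxt <= pc: # taken backward branch: start capturing
            state, start, end, body = "capture", nxt, pc, []

    return front_end, from_window


# Hypothetical trace: 2 instructions, a 3-instruction loop run 5 times, 2 instructions.
trace = [0, 1] + [2, 3, 4] * 5 + [5, 6]
print(simulate(trace))  # (10, 9)
```

In this made-up trace the first iteration reveals the loop, the second is captured while still using the front-end, and the remaining three are served from the window, so 9 of the 19 dynamic instructions skip the front-end. These numbers describe only the toy trace and say nothing about the SPEC results reported above.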