UCIFF: Unified Cluster Assignment Instruction Scheduling and Fast Frequency Selection for Heterogeneous Clustered VLIW Cores

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 7760)

Abstract

Clustered VLIW processors are scalable, wide-issue, statically scheduled processors. Their design physically partitions hardware resources that would otherwise be shared, which leads to both high performance and low energy consumption. In traditional clustered VLIW processors, all clusters operate at the same frequency. Heterogeneous clustered VLIW processors, however, support dynamic voltage and frequency scaling (DVFS) independently per cluster. Effectively controlling DVFS to selectively decrease the frequency of clusters with substantial slack in their schedule can lead to significant energy savings.

In this paper we propose UCIFF, a new scheduling algorithm for heterogeneous clustered VLIW processors with software DVFS control that performs cluster assignment, instruction scheduling and fast frequency selection simultaneously, in a single compiler pass. The proposed algorithm solves the phase-ordering problem between frequency selection and scheduling that is present in existing algorithms. We compared the quality of the generated code, using both performance and energy-related metrics, against that of the current state of the art and an optimal scheduler. The results show that UCIFF produces better code than the state of the art, very close to the optimal across the MediaBench II benchmarks, while keeping the algorithmic complexity low.
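
To make the abstract's point about slack concrete, here is a minimal sketch of why slowing down an under-used cluster can save energy. It is not the UCIFF algorithm: it ignores data dependences, inter-cluster communication and cluster assignment, and treats each cluster independently. The function names, the list of frequency levels, and the first-order energy model (voltage scaling linearly with frequency, so dynamic energy per cycle is proportional to the square of the relative frequency) are all assumptions made for illustration.

```python
def pick_frequencies(busy_cycles, deadline_cycles, freq_levels):
    """Pick, per cluster, the lowest relative frequency that still fits the deadline.

    busy_cycles:     cycles of work assigned to each cluster (hypothetical input)
    deadline_cycles: schedule length, measured in full-frequency cycles
    freq_levels:     allowed relative frequencies, e.g. [0.5, 0.75, 1.0]
    """
    chosen = []
    for busy in busy_cycles:
        # At relative frequency f, `busy` cycles take busy / f full-speed cycles.
        feasible = [f for f in sorted(freq_levels) if busy / f <= deadline_cycles]
        if not feasible:
            raise ValueError("cluster cannot meet the deadline at any frequency")
        chosen.append(feasible[0])  # lowest feasible frequency
    return chosen


def relative_dynamic_energy(busy_cycles, freqs):
    """Dynamic energy relative to one cycle at full frequency.

    Assumes V is proportional to f, so energy per cycle scales with f**2.
    """
    return sum(busy * f ** 2 for busy, f in zip(busy_cycles, freqs))


if __name__ == "__main__":
    busy = [100, 40]          # cluster 1 has a lot of slack
    levels = [0.5, 0.75, 1.0]
    freqs = pick_frequencies(busy, deadline_cycles=100, freq_levels=levels)
    print(freqs)                                        # [1.0, 0.5]
    print(relative_dynamic_energy(busy, [1.0, 1.0]))    # 140.0
    print(relative_dynamic_energy(busy, freqs))         # 110.0
```

In this toy, frequencies are chosen only after the work per cluster is fixed. That separation is the kind of phase ordering the abstract says UCIFF avoids by making the decisions together in one pass.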

Keywords

  • clustered VLIW
  • heterogeneous
  • DVFS
  • scheduling
  • phase-ordering

This work was supported in part by the EC under grant ERA 249059 (FP7).

Chapter length: 16 pages

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Porpodas, V., Cintra, M. (2013). UCIFF: Unified Cluster Assignment Instruction Scheduling and Fast Frequency Selection for Heterogeneous Clustered VLIW Cores. In: Kasahara, H., Kimura, K. (eds) Languages and Compilers for Parallel Computing. LCPC 2012. Lecture Notes in Computer Science, vol 7760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37658-0_9

  • DOI: https://doi.org/10.1007/978-3-642-37658-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37657-3

  • Online ISBN: 978-3-642-37658-0

  • eBook Packages: Computer Science, Computer Science (R0)