Abstract
Clustered VLIW processors are scalable wide-issue statically scheduled processors. Their design is based on physically partitioning the otherwise shared hardware resources, a design which leads to both high performance and low energy consumption. In traditional clustered VLIW processors, all clusters operate at the same frequency. Heterogeneous clustered VLIW processors however, support dynamic voltage and frequency scaling (DVFS) independently per cluster. Effectively controlling DVFS, to selectively decrease the frequency of clusters with a lot of slack in their schedule, can lead to significant energy savings.
In this paper we propose UCIFF, a new scheduling algorithm for heterogeneous clustered VLIW processors with software DVFS control, that performs cluster assignment, instruction scheduling and fast frequency selection simultaneously, all in a single compiler pass. The proposed algorithm solves the phase ordering problem between frequency selection and scheduling, present in existing algorithms. We compared the quality of the generated code, using both performance and energy-related metrics, against that of the current state-of-the-art and an optimal scheduler. The results show that UCIFF produces better code than the state-of-the-art, very close to the optimal across the mediabench2 benchmarks, while keeping the algorithmic complexity low.
This work was supported in part by the EC under grant ERA 249059 (FP7).
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Gcc: Gnu compiler collection, http://gcc.gnu.org
Aleta, A., Codina, J., González, A., Kaeli, D.: Heterogeneous clustered vliw microarchitectures. In: CGO, pp. 354–366 (2007)
Baniasadi, A., Moshovos, A.: Asymmetric-frequency clustering: a power-aware back-end for high-performance processors. In: ISLPED, pp. 255–258 (2002)
Desoli, G.: Instruction assignment for clustered vliw dsp compilers: A new approach. HP laboratories Technical Report HPL (1998)
Ellis, J.: Bulldog: A compiler for vliw architectures. Technical Report, Yale Univ., New Haven, CT, USA (1985)
Faraboschi, P., Brown, G., et al.: Lx: a technology platform for customizable vliw embedded processing. In: ISCA, pp. 203–213 (2000)
Fridman, J., Greenfield, Z.: The tigersharc dsp architecture. IEEE Micro 20(1), 66–76 (2000)
Fritts, J., Steiling, F., et al.: Mediabench ii video: expediting the next generation of video systems research. In: Proceedings of SPIE, vol. 5683, p. 79 (2005)
Kailas, K., Ebcioglu, K., Agrawala, A.: Cars: a new code generation framework for clustered ilp processors. Technical Report UMIACS-TR-2000-55 (2000)
Kailas, K., Ebcioglu, K., Agrawala, A.: Cars: a new code generation framework for clustered ilp processors. In: HPCA, pp. 133–143 (2001)
Lee, W., Barua, R., et al.: Space-time scheduling of instruction-level parallelism on a raw machine. In: ASPLOS (1998)
Lowney, P.G., Freudenberger, S.M., et al.: The multiflow trace scheduling compiler. Journal of Supercomputing 7, 51–142 (1993)
Muralimanohar, N., et al.: Power efficient resource scaling in partitioned architectures through dynamic heterogeneity. In: ISPASS, pp. 100–111 (2006)
Ozer, E., et al.: Unified assign and schedule: a new approach to scheduling for clustered register file microarchitectures, pp. 308–315 (1998)
Pechanek, G., Vassiliadis, S.: The ManArray embedded processor architecture. Euromicro 1, 348–355 (2000)
Sharangpani, H., Arora, H.: Itanium processor microarchitecture. IEEE Micro 20(5), 24–43 (2000)
Terechko, A., Corporaal, H.: Inter-cluster communication in vliw architectures. ACM Transactions on Architecture and Code Optimization (TACO) 4(2), 11 (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Porpodas, V., Cintra, M. (2013). UCIFF: Unified Cluster Assignment Instruction Scheduling and Fast Frequency Selection for Heterogeneous Clustered VLIW Cores. In: Kasahara, H., Kimura, K. (eds) Languages and Compilers for Parallel Computing. LCPC 2012. Lecture Notes in Computer Science, vol 7760. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37658-0_9
Download citation
DOI: https://doi.org/10.1007/978-3-642-37658-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37657-3
Online ISBN: 978-3-642-37658-0
eBook Packages: Computer ScienceComputer Science (R0)