Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity

Chen, Zhi-xiang; Li, Zhao-lin; Cao, Shan; Wang, Fang; Zhou, Jie

doi:10.1631/FITEE.1500035

Published: 10 December 2015

Volume 16, pages 1018–1033, (2015)

Frontiers of Information Technology & Electronic Engineering

Zhi-xiang Chen^1,2, Zhao-lin Li^3,4, Shan Cao^2,5, Fang Wang^3,4 & Jie Zhou^1

Abstract

Homogeneous multi-core processors are widely used for computation-intensive embedded applications. However, with the continuous downscaling of CMOS technology, within-die variations in the manufacturing process lead to a significant spread in the operating speeds of cores within a homogeneous multi-core processor. Task scheduling approaches that do not consider the heterogeneity caused by within-die variations can lead to overly pessimistic performance results. To achieve the best performance that the actual maximum clock frequencies of the cores allow, we present a heterogeneity-aware schedule refining (HASR) scheme that fully exploits the heterogeneity of homogeneous multi-core processors in the embedded domain. We analyze and show how the actual maximum frequencies of the cores are used to guide scheduling. In the scheme, representative chip operating points are selected, and the corresponding optimal schedules are generated as candidate schedules. During the booting of each chip, one of the candidate schedules is bound to the chip according to the actual maximum clock frequencies of its cores, so as to maximize performance. A set of applications is designed to evaluate the proposed scheme. Experimental results show that the proposed scheme improves performance by 22.2% on average compared with the baseline schedule based on worst-case timing analysis. Compared with the conventional task scheduling approach based on the actual maximum clock frequencies, the proposed scheme also improves performance by up to 12%.
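The boot-time binding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' HASR implementation, which the abstract does not detail. The sketch assumes that a candidate schedule is safe to bind only if every core can run at least as fast as the operating point the schedule was generated for, and that the shortest such schedule is preferred. The `Candidate` type, the `bind_schedule` function, and all frequency and makespan values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """A precomputed schedule tied to one representative operating point (hypothetical type)."""
    name: str
    point_mhz: Tuple[float, ...]  # per-core frequencies the schedule was generated for
    makespan_us: float            # schedule length when cores run at point_mhz


def bind_schedule(actual_mhz: Sequence[float],
                  candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the shortest candidate whose operating point the chip can sustain.

    Assumption: a candidate is feasible only if every core's measured maximum
    frequency is at least the frequency assumed by the candidate's operating point.
    """
    feasible = [
        c for c in candidates
        if len(c.point_mhz) == len(actual_mhz)
        and all(need <= have for need, have in zip(c.point_mhz, actual_mhz))
    ]
    # default=None covers the case where no candidate is feasible
    return min(feasible, key=lambda c: c.makespan_us, default=None)


# Illustrative values only; none of these numbers come from the article.
candidates = [
    Candidate("worst_case", (800, 800, 800, 800), 125.0),
    Candidate("mid", (900, 850, 900, 850), 110.0),
    Candidate("fast", (1000, 950, 1000, 950), 98.0),
]

# Maximum frequencies measured on one chip at boot (hypothetical)
chosen = bind_schedule((960, 870, 940, 900), candidates)
print(chosen.name)  # prints "mid"
```

In this sketch the worst-case candidate is feasible on any chip that meets its specification, so it acts as the fallback, while faster chips are bound to shorter schedules.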

Author information

Authors and Affiliations

Department of Automation, Tsinghua University, Beijing, 100084, China
Zhi-xiang Chen & Jie Zhou
Institute of Microelectronics, Tsinghua University, Beijing, 100084, China
Zhi-xiang Chen & Shan Cao
Research Institute of Information Technology, Tsinghua University, Beijing, 100084, China
Zhao-lin Li & Fang Wang
Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, 100084, China
Zhao-lin Li & Fang Wang
The School of Information and Electronics, Beijing Institute of Technology, Beijing, 100084, China
Shan Cao

Corresponding author

Correspondence to Zhi-xiang Chen.

Additional information

Project supported by the National Natural Science Foundation of China (Nos. 61225008, 61373074, and 61373090), the National Basic Research Program (973) of China (No. 2014CB349304), the Specialized Research Fund for the Doctoral Program of Higher Education, the Ministry of Education of China (No. 20120002110033), and the Tsinghua University Initiative Scientific Research Program.

ORCID: Zhi-xiang CHEN, http://orcid.org/0000-0001-7986-030X

About this article

Cite this article

Chen, Zx., Li, Zl., Cao, S. et al. Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity. Frontiers Inf Technol Electronic Eng 16, 1018–1033 (2015). https://doi.org/10.1631/FITEE.1500035

Received: 01 February 2015
Accepted: 26 August 2015
Published: 10 December 2015
Issue Date: December 2015
DOI: https://doi.org/10.1631/FITEE.1500035

CLC number

TP302
