Characterizing the Impact of Using Spare-Cores on Application Performance

  • José Carlos Sancho
  • Darren J. Kerbyson
  • Michael Lang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6271)

Abstract

Increased parallelism on a single processor is driving improvements in peak performance at both the node and system levels. However, achievable performance, in particular for production scientific applications, is not always directly proportional to the core count. Performance is often limited by constraints in the memory hierarchy and by node interconnectivity. Even on state-of-the-art processors containing between four and eight cores, many applications cannot take full advantage of the compute performance of all cores. This trend is expected to intensify on future processors as the core count per processor increases. In this work we characterize the use of spare-cores, that is, cores that do not provide any improvement in application performance, on current multi-core processors. By using a pulse-width modulation method, we examine the possible performance profile of using a spare-core and quantify the situations in which its use will not impact application performance. We show that, for current AMD and Intel multi-core processors, spare-cores can be used for substantial computational tasks but can impact application performance when they use shared caches or access main memory heavily.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • José Carlos Sancho (1)
  • Darren J. Kerbyson (2)
  • Michael Lang (3)
  1. Barcelona Supercomputing Center, Barcelona, Spain
  2. Pacific Northwest National Laboratory, Richland, USA
  3. Los Alamos National Laboratory, Los Alamos, USA
