Cluster Computing, Volume 16, Issue 1, pp 117–129

The impact of system design parameters on application noise sensitivity

  • Kurt B. Ferreira
  • Patrick G. Bridges
  • Ron Brightwell
  • Kevin T. Pedretti

DOI: 10.1007/s10586-011-0178-3

Cite this article as:
Ferreira, K.B., Bridges, P.G., Brightwell, R. et al. Cluster Comput (2013) 16: 117. doi:10.1007/s10586-011-0178-3

Abstract

Operating system (OS) noise, or jitter, is a key limiter of application scalability in high-end computing systems. Several studies have attempted to quantify the sources and effects of system interference, but few show how architectural and system characteristics influence the impact of noise at scale. In this paper, we examine the impact of three such system properties: platform balance, noisy node distribution, and the choice of collective algorithm. Using a previously developed noise injection tool, we explore how the impact of noise varies with these platform characteristics. We provide detailed performance results indicating that a system with relatively less network bandwidth can absorb more noise than a system with more network bandwidth. Our results also show that application performance can be significantly degraded even when only a subset of nodes is noisy. Furthermore, the placement of the noisy nodes matters, especially for applications that make substantial use of tree-based collective communication operations. Lastly, performance results indicate that non-blocking collective operations can greatly mitigate the impact of OS interference. Taken together, these results show that the impact of OS noise is not solely a property of application communication behavior, but is also influenced by other properties of the system architecture and system software environment.
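
The claim that a subset of noisy nodes can degrade the whole application can be illustrated with a toy model. The sketch below is not the noise injection tool used in the paper. It assumes a bulk-synchronous application in which every iteration ends at a global barrier, so each iteration takes as long as the slowest node. The function name `simulate` and all parameter values (`work`, `noise_prob`, `noise_len`) are hypothetical and chosen only for illustration.

```python
import random


def simulate(num_nodes, noisy_fraction, iterations=1000,
             work=1.0, noise_prob=0.01, noise_len=0.25, seed=0):
    """Mean iteration time of a toy bulk-synchronous application.

    Assumptions (illustrative only, not taken from the paper):
    every node computes for `work` time units per iteration, and each
    noisy node is independently delayed by `noise_len` with probability
    `noise_prob`. A barrier ends each iteration, so the iteration time
    is the maximum over all nodes.
    """
    rng = random.Random(seed)
    noisy_nodes = int(num_nodes * noisy_fraction)
    total = 0.0
    for _ in range(iterations):
        # Quiet nodes always finish at `work`; a noisy node finishes at
        # `work + noise_len` when it is interrupted in this iteration.
        slowest = work
        for _ in range(noisy_nodes):
            if rng.random() < noise_prob:
                slowest = work + noise_len
                break  # one delayed node already fixes the maximum
        total += slowest
    return total / iterations


if __name__ == "__main__":
    for fraction in (0.0, 0.05, 0.25, 1.0):
        mean_time = simulate(1024, fraction)
        print(f"noisy fraction {fraction:4.2f}: mean iteration time {mean_time:.3f}")
```

Under these assumptions, an iteration is delayed whenever at least one of the k noisy nodes is interrupted, which happens with probability 1 - (1 - p)^k for per-node interruption probability p. This grows quickly with k, which is why a modest fraction of noisy nodes can slow nearly every iteration in this model. The sketch does not capture node placement, network bandwidth, or the choice of collective algorithm, which the paper examines.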

Keywords

Operating systems interference · Jitter · System balance · Non-blocking collectives

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Kurt B. Ferreira (1, 2)
  • Patrick G. Bridges (2)
  • Ron Brightwell (1)
  • Kevin T. Pedretti (1)

  1. Scalable System Software Department, Sandia National Laboratories, Albuquerque, USA
  2. Computer Science Department, The University of New Mexico, Albuquerque, USA
