Cluster Computing, Volume 17, Issue 1, pp 1–18

Optimizing I/O forwarding techniques for extreme-scale event tracing

Authors

  • Thomas Ilsche
    • Technische Universität Dresden (ZIH)
  • Joseph Schuchart
    • Oak Ridge National Laboratory
  • Jason Cope
    • Argonne National Laboratory
  • Dries Kimpe
    • Argonne National Laboratory
  • Terry Jones
    • Oak Ridge National Laboratory
  • Andreas Knüpfer
    • Technische Universität Dresden (ZIH)
  • Kamil Iskra
    • Argonne National Laboratory
  • Robert Ross
    • Argonne National Laboratory
  • Wolfgang E. Nagel
    • Technische Universität Dresden (ZIH)
  • Stephen Poole
    • Oak Ridge National Laboratory

DOI: 10.1007/s10586-013-0272-9

Cite this article as:
Ilsche, T., Schuchart, J., Cope, J. et al. Cluster Comput (2014) 17: 1. doi:10.1007/s10586-013-0272-9

Abstract

Programming development tools are a vital component for understanding the behavior of parallel applications. Event tracing is a principal ingredient of these tools, but new and serious challenges place event tracing at risk on extreme-scale machines. As the quantity of captured events increases with concurrency, the additional data can overload the parallel file system and perturb the application being observed. In this work, we present a solution for event tracing on extreme-scale machines. We enhance an I/O forwarding software layer to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system. Furthermore, we introduce a sophisticated write buffering capability to limit the impact of trace output on the observed application. To validate the approach, we employ the Vampir tracing toolset using these new capabilities. Our results demonstrate that the approach increases the maximum traced application size by a factor of 5, to more than 200,000 processes.

Keywords

Event tracing, I/O forwarding, Atomic append

Copyright information

© Springer Science + Business Media New York (outside the USA) 2013