Design and Evaluation of Nonblocking Collective I/O Operations

  • Vishwanath Venkatesan
  • Mohamad Chaarawi
  • Edgar Gabriel
  • Torsten Hoefler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6960)


Nonblocking operations have been used successfully to hide network latencies in large-scale parallel applications. This paper presents the challenges associated with developing nonblocking collective I/O operations, which are intended to help hide the cost of I/O operations. We also present an implementation based on the libNBC library, and evaluate the benefits of nonblocking collective I/O over a PVFS2 file system for a micro-benchmark and a parallel image processing application. Our results indicate the potential benefit of our approach, but also highlight the challenges of achieving appropriate overlap between I/O and compute operations.
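The overlap pattern described in the abstract (initiate an I/O operation, compute while it is in flight, then wait for completion) can be illustrated with a minimal sketch. This is only an analogy using a Python worker thread and a local file; it is not the libNBC-based implementation or any MPI interface from the paper, and the names `write_block`, `compute`, and `overlapped_write` are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_block(path, data):
    """Blocking write; runs in a worker thread so the caller can continue."""
    with open(path, "wb") as f:
        f.write(data)
    return len(data)


def compute(n):
    """Stand-in for the application's compute phase (hypothetical workload)."""
    return sum(i * i for i in range(n))


def overlapped_write(path, data, n):
    """Start the write, compute while it is pending, then wait for it."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(write_block, path, data)  # initiate the I/O
        result = compute(n)                             # overlap window
        written = pending.result()                      # wait for completion
    return result, written


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "out.bin")
    print(overlapped_write(path, b"x" * 1024, 1000))
```

How much of the write actually proceeds during `compute` depends on the runtime and the operating system, which mirrors the abstract's point that achieving appropriate overlap is itself a challenge.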


Keywords (added by machine, not by the authors): Message Passing Interface; Communication Operation; Collective Operation; Message Passing Interface Process; Open Message Passing Interface





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Vishwanath Venkatesan (1)
  • Mohamad Chaarawi (1)
  • Edgar Gabriel (1)
  • Torsten Hoefler (2)

  1. Department of Computer Science, University of Houston, USA
  2. Blue Waters Directorate, University of Illinois, USA
