CoMPI: Enhancing MPI Based Applications Performance and Scalability Using Run-Time Compression

  • Rosa Filgueira
  • David E. Singh
  • Alejandro Calderón
  • Jesús Carretero
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5759)


This paper presents CoMPI, an optimization of MPI communications based on run-time compression of the MPI messages exchanged by applications. A broad set of compression algorithms has been fully implemented and tested for both MPI collective and point-to-point primitives. In addition, the paper presents a study of several compression algorithms suitable for run-time compression, based on the datatypes used by applications. This study has been validated using several MPI benchmarks and real HPC applications. The results show that, in most cases, compression reduces application communication time, enhancing performance and scalability. In this way, CoMPI achieves important improvements in overall execution time for many of the considered scenarios.
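The core idea described above can be illustrated with a minimal sketch: compress the message payload before handing it to the transport, and decompress it on the receiving side, so that highly redundant scientific data costs less bandwidth. This sketch uses Python's zlib as a stand-in for CoMPI's compression layer; the function names are hypothetical illustrations, not the paper's actual API.

```python
import zlib

def compress_message(buf: bytes, level: int = 1) -> bytes:
    # Compress the outgoing payload; a fast compression level keeps
    # CPU overhead below the network-transfer savings.
    return zlib.compress(buf, level)

def decompress_message(data: bytes) -> bytes:
    # Invert the compression on the receiving side.
    return zlib.decompress(data)

# A highly redundant payload, of the kind scientific MPI applications
# often exchange (large arrays with repeated or zeroed regions).
message = bytes(1000) + b"payload" * 100

packed = compress_message(message)
restored = decompress_message(packed)
ratio = len(message) / len(packed)
```

In a real MPI setting, `packed` would be passed to a send primitive (point-to-point or collective) together with its compressed length, and the receiver would decompress after the matching receive; whether compression pays off depends on the redundancy level of the data, which is what motivates the paper's per-datatype algorithm study.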


Keywords: Execution Time, Compression Ratio, Buffer Size, Compression Algorithm, Redundancy Level





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rosa Filgueira¹
  • David E. Singh¹
  • Alejandro Calderón¹
  • Jesús Carretero¹

  1. Department of Computer Science, University Carlos III of Madrid, Spain
