
Exploiting Single-Assignment Properties to Optimize Message-Passing Programs by Code Transformations

  • Alfredo Cristóbal-Salas
  • Andrey Chernykh
  • Edelmira Rodríguez-Alcantar
  • Jean-Luc Gaudiot
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3474)

Abstract

The message-passing paradigm is now widely accepted and is used mainly for inter-process communication in distributed-memory parallel systems. However, one of its disadvantages is the high cost of data exchange. In this paper, we therefore describe a message-passing optimization technique that exploits single-assignment and constant-information properties to reduce the number of communications. Like the more general partial evaluation approach, the technique evaluates local and remote memory operations when only part of the input is known or available, and it further specializes the program with respect to the input data. It applies to programs that use a distributed single-assignment memory system. Experimental results show a considerable speedup for programs running on computer systems with slow interconnection networks. We also show that single-assignment memory systems can tolerate network latency better, and that the overhead introduced by their management can be hidden.
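
The single-assignment property mentioned in the abstract can be illustrated with a small sketch. This is not the authors' transformation; it is a simplified, single-process simulation under stated assumptions. The names `SingleAssignmentMemory`, `CachingReader`, and `demo` are hypothetical, and a "remote read" is modeled as a counted method call rather than a real message. The point it shows is that a cell written at most once can be cached safely after its first fetch, so repeated reads need no further communication.

```python
class SingleAssignmentMemory:
    """Simulated remote memory whose cells can be written at most once.

    Hypothetical illustration only: a real system would exchange
    messages over a network instead of calling a method.
    """

    def __init__(self):
        self._cells = {}
        self.messages = 0  # number of simulated remote reads

    def write(self, addr, value):
        # Single assignment: a second write to the same cell is an error.
        if addr in self._cells:
            raise ValueError(f"cell {addr!r} already written")
        self._cells[addr] = value

    def remote_read(self, addr):
        # Each call stands in for one request/reply exchange.
        self.messages += 1
        return self._cells[addr]


class CachingReader:
    """Local cache in front of the simulated remote memory.

    Because a written cell never changes, a cached copy never goes
    stale, so no invalidation protocol is needed.
    """

    def __init__(self, memory):
        self._memory = memory
        self._cache = {}

    def read(self, addr):
        if addr not in self._cache:
            self._cache[addr] = self._memory.remote_read(addr)
        return self._cache[addr]


def demo():
    mem = SingleAssignmentMemory()
    for i in range(4):
        mem.write(i, i * i)
    reader = CachingReader(mem)
    # Twelve reads over four distinct cells.
    total = sum(reader.read(i % 4) for i in range(12))
    return total, mem.messages
```

In this sketch, `demo()` performs twelve reads but only four simulated remote fetches, one per distinct cell. Without the single-assignment guarantee, the cache would need an invalidation mechanism to remain correct.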

Keywords

Memory System; Partial Evaluation; Remote Memory; Collective Communication; Code Transformation

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Alfredo Cristóbal-Salas (1)
  • Andrey Chernykh (2)
  • Edelmira Rodríguez-Alcantar (3)
  • Jean-Luc Gaudiot (4)

  1. School of Chemistry Science and Engineering, Autonomous University of Baja California, Tijuana, Mexico
  2. Computer Science Department, CICESE Research Center, Ensenada, Mexico
  3. Computer Science, University of Sonora, Hermosillo, Mexico
  4. Electrical Engineering and Computer Science, University of California, Irvine, Irvine, USA
