Addressing communication latency issues on clusters for fine grained asynchronous applications—A case study

  • Umesh Kumar V. Rajasekaran
  • Malolan Chetlur
  • Girindra D. Sharma
  • Radharamanan Radhakrishnan
  • Philip A. Wilsey
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1586)


With the advent of cheap and powerful hardware for workstations and networks, a new cluster-based architecture for parallel processing applications has been envisioned. However, fine-grained asynchronous applications that communicate frequently are not ideal candidates for such architectures because of their high communication latency. Designers of fine-grained parallel applications on clusters therefore face the problem of reducing this latency. Depending on the resources available, communication latency can be improved along the following dimensions: (a) reducing network latency by employing higher-performance network hardware (e.g., Fast Ethernet versus Myrinet); (b) reducing communication software overhead by developing more efficient communication libraries (MPICH versus TCPMPL, our TCP/IP-based message passing layer, versus MPI-BIP); (c) rewriting or restructuring the application code so that it communicates less frequently; and (d) deploying communication optimizations that exploit the application's inherent communication characteristics. This paper discusses our experiences with building a communication subsystem on a cluster of workstations for a fine-grained asynchronous application (a Time Warp synchronized discrete-event simulator). Specifically, our efforts to reduce communication latency along three of the four aforementioned dimensions are detailed and discussed. In addition, performance results from an in-depth empirical evaluation of the communication subsystem are reported.
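As a rough illustration of why dimensions (b) and (c) matter for fine-grained applications, the following sketch uses a simple fixed-overhead-plus-transfer cost model. The model, the function names, and every number a caller might plug in are assumptions made for illustration; none of them come from the paper or its measurements.

```python
# Illustrative cost model only. It assumes each send pays a fixed software/network
# overhead plus a transfer time proportional to its size. All names and any
# parameter values used with these functions are hypothetical.

def total_comm_time(num_messages, bytes_per_message, overhead_s, bandwidth_bytes_per_s):
    """Time to send num_messages individually, each paying the fixed overhead."""
    return num_messages * (overhead_s + bytes_per_message / bandwidth_bytes_per_s)


def batched_comm_time(num_messages, bytes_per_message, batch_size,
                      overhead_s, bandwidth_bytes_per_s):
    """Time to send the same payload when batch_size messages share one overhead."""
    num_batches = -(-num_messages // batch_size)  # ceiling division
    payload_time = num_messages * bytes_per_message / bandwidth_bytes_per_s
    return num_batches * overhead_s + payload_time
```

Under this model, lowering `overhead_s` corresponds to a leaner communication library, while raising `batch_size` corresponds to restructuring the application to communicate less often. When messages are small, the fixed overhead dominates, so either change reduces total time far more than extra bandwidth would.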


Keywords (added by machine, not by the authors): Message Passing Interface; Message Passing; Communication Latency; Polling Frequency; Message Latency





Copyright information

© Springer-Verlag 1999

Authors and Affiliations

  • Umesh Kumar V. Rajasekaran (1)
  • Malolan Chetlur (1)
  • Girindra D. Sharma (1)
  • Radharamanan Radhakrishnan (1)
  • Philip A. Wilsey (1)

  1. Computer Architecture Design Laboratory, Dept. of ECECS, Cincinnati
