A New Process Placement Algorithm in Multi-core Clusters Aimed to Reducing Network Interface Contention

  • Ghobad Zarrinchian
  • Mohsen Soryani
  • Morteza Analoui
Part of the Advances in Intelligent Systems and Computing book series (volume 167)

Abstract

The number of processing cores per computing node in current cluster systems is growing rapidly. Despite this trend, the number of network interfaces available in such nodes has remained almost unchanged. This can lead to heavy use of the network interface in many workloads, especially those with intensive inter-process communication. As a result, the network interface can become a bottleneck and degrade performance drastically. This paper introduces a new process mapping algorithm for multi-core clusters that aims to reduce network interface contention and improve the performance of running parallel applications. Comparison of the new algorithm with other well-known methods on synthetic and real workloads indicates that the new strategy can achieve a 5% to 90% performance improvement in communication-heavy workloads.

Copyright information

© Springer-Verlag GmbH Berlin Heidelberg 2012

Authors and Affiliations

  • Ghobad Zarrinchian (1)
  • Mohsen Soryani (1)
  • Morteza Analoui (1)
  1. Iran University of Science and Technology, Tehran, Iran
