
A Methodology for Handling Data Movements by Anticipation: Position Paper

  • Raphaël Bleuse
  • Giorgio Lucarelli
  • Denis Trystram
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11339)

Abstract

The enhanced capabilities of large-scale parallel and distributed platforms produce a continuously increasing amount of data, which has to be stored, exchanged and used by various tasks allocated on different nodes of the system. Managing such a huge communication demand is crucial for reaching the best possible performance of the system. Meanwhile, we have to deal with more interference, as the trend is to use a single all-purpose interconnection network whatever the interconnect (tree-based hierarchies or topology-based heterarchies). There are two different types of communications: the flows induced by data exchanges during the computations, and the flows related to Input/Output operations. In this paper we propose a general model for interference-aware scheduling, where explicit communications are replaced by external topological constraints. Specifically, the interference of both communication types is reduced by adding geometric constraints on the allocation of tasks to machines. The proposed constraints implicitly reduce data movements by restricting the set of possible allocations for each task. This methodology was shown to be efficient in a recent study on a restricted interconnection network (a line/ring of processors, which is intermediate between a tree and higher-dimensional grids/tori). The results illustrated the difficulty of the problem even on simple topologies, but also provided a pragmatic greedy solution, which simulations assessed to be efficient. We are currently extending this solution to more complex topologies. This work is a position paper that describes the methodology; it does not focus on the solving part.
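
To make the idea of restricting the set of possible allocations concrete, the following minimal sketch assumes one possible geometric constraint, contiguity, on a line or ring of processors. The function names (`contiguous_windows`, `first_fit_contiguous`) and the first-fit rule are illustrative assumptions; this is not the greedy solution evaluated in the study mentioned above.

```python
from typing import List, Optional, Set


def contiguous_windows(n_nodes: int, size: int, ring: bool = False) -> List[List[int]]:
    """Enumerate every allocation of `size` consecutive nodes.

    Hypothetical helper: contiguity is only one example of a geometric
    constraint; the abstract does not prescribe this exact rule.
    """
    if size <= 0 or size > n_nodes:
        return []
    if ring and size == n_nodes:
        # On a ring, all rotations of the full machine are the same node set.
        return [list(range(n_nodes))]
    # A line has n - size + 1 windows; a ring also allows wrap-around windows.
    starts = range(n_nodes) if ring else range(n_nodes - size + 1)
    return [[(s + k) % n_nodes for k in range(size)] for s in starts]


def first_fit_contiguous(
    n_nodes: int, busy: Set[int], size: int, ring: bool = False
) -> Optional[List[int]]:
    """Return the first contiguous window that avoids all busy nodes, or None."""
    for window in contiguous_windows(n_nodes, size, ring):
        if busy.isdisjoint(window):
            return window
    return None


if __name__ == "__main__":
    # Nodes 2 and 3 are busy on a 6-node platform; a task asks for 3 nodes.
    print(first_fit_contiguous(6, {2, 3}, 3, ring=False))
    print(first_fit_contiguous(6, {2, 3}, 3, ring=True))
```

In this sketch the constraint shrinks the candidate set from all subsets of free nodes to a handful of windows, so the communications of a task stay on links inside its own window. The same example also shows the cost of the restriction: on the line no window is free although three nodes are idle, whereas the ring admits the wrap-around window.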

Keywords

Scheduling, Affinity, Data movements, Heterogeneity, Topology, HPC

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, Grenoble, France
  2. FSTC/CSC, University of Luxembourg, Luxembourg City, Luxembourg
