Versatile task scheduling of binary trees for realistic machines
Scheduling models generally treat message latency as the only significant communication parameter. In many parallel systems, however, latency is negligible compared with the CPU penalties associated with sending and receiving communication events. Our work considers a model in which this CPU penalty can also be a significant communication parameter. This paper analyses the effect of such a model on the scheduling of full binary trees. We briefly describe a versatile, multi-stage scheduling approach that can be customised to classes of parallel systems according to their communication performance characteristics, and that produces better makespans than traditional techniques.
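To illustrate why a CPU penalty on sends and receives can change the picture for full binary trees, the following is a minimal sketch and not the paper's model or algorithm, neither of which is specified in this abstract. It assumes a full binary in-tree of unit-style tasks with cost `w`, and a simple hypothetical schedule in which each internal node keeps one child on its own processor and receives the other child's result from a different processor. The names `latency`, `send_cpu` and `recv_cpu` are illustrative assumptions.

```python
def sequential_makespan(height, w):
    """Makespan when every task of a full binary tree runs on one processor.

    A full binary tree of the given height has 2**(height + 1) - 1 tasks.
    """
    return (2 ** (height + 1) - 1) * w


def paired_makespan(height, w, latency, send_cpu=0, recv_cpu=0):
    """Makespan of a hypothetical schedule with one remote child per node.

    Assumed cost per tree level on the critical path:
      send_cpu  - CPU time the sender spends issuing the message
      latency   - time the message spends in the network
      recv_cpu  - CPU time the receiver spends accepting the message
      w         - execution time of the parent task
    With send_cpu = recv_cpu = 0 this reduces to a latency-only model.
    """
    finish = w  # a leaf task finishes after its own execution time
    for _ in range(height):
        # Both subtrees finish at the same time by symmetry; the parent
        # must still wait for the remote result to be sent and received.
        finish += send_cpu + latency + recv_cpu + w
    return finish


if __name__ == "__main__":
    h, w = 3, 1
    print("sequential:", sequential_makespan(h, w))
    print("latency only:", paired_makespan(h, w, latency=1))
    print("with CPU penalties:",
          paired_makespan(h, w, latency=1, send_cpu=2, recv_cpu=2))
```

Under these assumed values, the parallel schedule beats the sequential one when only latency is charged, but loses to it once the send and receive penalties are added to every level of the critical path. This is the kind of effect that would motivate tailoring the schedule to a machine's communication characteristics.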