Abstract
The models simulated by engineering programs such as computational fluid dynamics and structural mechanics codes, and the resources they require, grow more rapidly than single-processor performance. Automatic parallelisation seems the obvious approach for large, long-established packages like PERMAS. Our approach is based on dynamic scheduling, which is more flexible than domain decomposition, is completely transparent to the end user, and shows good speedups because it can extract parallelism where other approaches cannot. In this paper we show that some preparatory steps on the large input matrices are needed for good performance. We present a new blocking approach that saves storage and shortens the critical path of the computation. We also propose a data distribution step that guides the dynamic scheduler's decisions so that an efficient parallelisation can be achieved even on multiprocessors with slow networks. A final and important step is the interleaving of the array blocks distributed to different processors; this step is essential to expose the parallelism to the scheduler.
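To make the role of the interleaving step concrete, the following Python sketch is purely illustrative; it is not taken from PERMAS, and the block count, processor count, and owner functions are our own assumptions. It compares a contiguous block-column assignment with a round-robin interleaved one and counts, at each elimination step of a right-looking blocked factorisation, how many processors own blocks that still receive updates, which bounds the parallelism a dynamic scheduler can exploit at that step.

```python
"""Illustrative sketch (not the PERMAS code): why interleaving block
columns across processors exposes more parallelism to a dynamic
scheduler than a contiguous assignment.

In a right-looking blocked factorisation, eliminating block column k
generates update tasks on block columns k+1 .. n-1.  A scheduler can
only run those updates concurrently on processors that own some of
the touched blocks, so the owner mapping caps the usable parallelism."""

NBLOCKS = 16   # block columns of a hypothetical stiffness matrix
NPROCS = 4     # hypothetical processor count


def contiguous_owner(b: int) -> int:
    """Blocks 0..3 -> P0, 4..7 -> P1, ... (contiguous chunks)."""
    return b // (NBLOCKS // NPROCS)


def interleaved_owner(b: int) -> int:
    """Round-robin interleaving: block 0 -> P0, 1 -> P1, 2 -> P2, ..."""
    return b % NPROCS


def active_procs(owner, step: int) -> int:
    """Processors owning at least one block updated at this step."""
    return len({owner(b) for b in range(step + 1, NBLOCKS)})


for name, owner in (("contiguous", contiguous_owner),
                    ("interleaved", interleaved_owner)):
    busy = [active_procs(owner, k) for k in range(NBLOCKS - 1)]
    print(f"{name:11s} busy processors per step: {busy}")
```

Under these assumptions the contiguous mapping serialises the tail of the factorisation on the last processor (the busy count decays 4, 3, 2, 1 long before the end), whereas the interleaved mapping keeps all four processors eligible for work until only three block columns remain, which is exactly why interleaving is needed to expose the parallelism to the scheduler.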
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ast, M. et al. (2000). Sparse Matrix Structure for Dynamic Parallelisation Efficiency. In: Bode, A., Ludwig, T., Karl, W., Wismüller, R. (eds) Euro-Par 2000 Parallel Processing. Euro-Par 2000. Lecture Notes in Computer Science, vol 1900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44520-X_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67956-1
Online ISBN: 978-3-540-44520-3