Abstract
This paper focuses on the partial evaluation of local and remote memory accesses in distributed applications. The aim is not only to remove much of the excess overhead of message-passing implementations but also to reduce the number of messages when some information about the input data set is known. The use of split-phase memory operations, the exploitation of spatial data locality, and non-strict information processing are described. Through a detailed performance analysis, we establish conditions under which the technique is beneficial. We show that by incorporating non-strict information processing into FFT MPI, a significant reduction in the number of messages can be achieved and overall system performance can be improved.
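As an illustration of the split-phase, non-strict access pattern named above, the following is a minimal sketch of an I-structure-style write-once array in Python. It is not the authors' implementation and involves no MPI; the class name `IStructure`, its methods, and the `demo` function are hypothetical names chosen for this sketch. A read may be issued before the element has been produced, in which case it is deferred and completed when the matching write arrives.

```python
class IStructure:
    """Write-once array whose reads may be issued before the data exists.

    Hypothetical illustration only; names are assumptions of this sketch.
    """

    _EMPTY = object()  # sentinel marking a cell that has not been written

    def __init__(self, size):
        self._cells = [self._EMPTY] * size
        self._deferred = [[] for _ in range(size)]  # pending readers per cell

    def read(self, index, on_value):
        """Split-phase read: issue the request now, get the value via callback."""
        cell = self._cells[index]
        if cell is self._EMPTY:
            # Value not produced yet: defer the request instead of blocking.
            self._deferred[index].append(on_value)
        else:
            on_value(cell)

    def write(self, index, value):
        """Single-assignment write that also completes any deferred reads."""
        if self._cells[index] is not self._EMPTY:
            raise ValueError(f"cell {index} already written")
        self._cells[index] = value
        waiting, self._deferred[index] = self._deferred[index], []
        for on_value in waiting:
            on_value(value)


def demo():
    xs = IStructure(2)
    log = []
    xs.read(0, lambda v: log.append(("early", v)))  # issued before the write
    xs.write(0, 1.5)                                # satisfies the early read
    xs.read(0, lambda v: log.append(("late", v)))   # completes immediately
    return log
```

In this sketch the consumer never waits on the producer: the early request is recorded and answered once the data exists, which is the behaviour that lets computation proceed on partially available input.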
This work is partly supported by CONACYT (Consejo Nacional de Ciencia y Tecnología de México) under grant #32989-A and by the National Science Foundation under Grants No. CSA-0073527 and INT-9815742. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of either the National Science Foundation or CONACYT.
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cristóbal-Salas, A., Tchernykh, A., Gaudiot, JL. (2003). Non-strict Evaluation of the FFT Algorithm in Distributed Memory Systems. In: Dongarra, J., Laforenza, D., Orlando, S. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2003. Lecture Notes in Computer Science, vol 2840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39924-7_28
DOI: https://doi.org/10.1007/978-3-540-39924-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20149-6
Online ISBN: 978-3-540-39924-7
eBook Packages: Springer Book Archive