Abstract
Whether scientific applications are suitable for the Imagine architecture is a challenging question. To address it, this paper presents a novel architecture-based optimization for the key techniques of mapping scientific applications to Imagine. Our specific contributions are that we achieve fine kernel granularity and select the necessary arrays to organize appropriate streams. In particular, we develop a new stream program generation algorithm based on the architecture-based optimization. We apply our algorithm to several representative scientific applications on the ISIM simulator of Imagine and compare the results with the corresponding FORTRAN programs running on Itanium 2. The experimental results show that the optimized stream programs can efficiently improve computational intensity, enhance the locality of the LRF and SRF, avoid index stream overhead, and expose parallelism to utilize the ALUs. These results show that Imagine is efficient for many scientific applications.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Du, J., Yang, X., Wang, G., Tang, T., Zeng, K. (2007). Architecture-Based Optimization for Mapping Scientific Applications to Imagine. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_6
DOI: https://doi.org/10.1007/978-3-540-74742-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74741-3
Online ISBN: 978-3-540-74742-0
eBook Packages: Computer Science (R0)