Dependency-Based Automatic Parallelization of Java Applications
Today's software contains billions of lines of sequential code that do not benefit from the parallelism available in modern multicore architectures. Transforming legacy sequential code into a parallel version of the same program is a complex and cumbersome task. Performing this transformation automatically, without the intervention of a developer, has been a long-standing research objective. This work proposes an elegant way of achieving that goal. By targeting a task-based runtime that manages execution using a task dependency graph, we developed a translator for sequential Java code that generates a highly parallel version of the same program. The translation process inspects the AST nodes for signatures such as read-write accesses and execution-flow modifications, among others, and generates a set of dependencies between executable tasks. The process has been applied to well-known problems, such as the recursive Fibonacci and FFT algorithms, resulting in versions capable of maximizing resource usage. For two CPU-bound applications, we obtained speedups of 10.97x and 9.0x on a 12-core machine.
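To make the idea of tasks linked by dependencies concrete, the following is a minimal hand-written sketch of a recursive Fibonacci in which each call becomes a task that depends on the results of its two subtasks. It is not the translator's output and does not use the task-based runtime described above: it substitutes Java's standard fork/join framework, and the names `DependencyFibSketch`, `FibTask`, `parallelFib`, and `SEQUENTIAL_CUTOFF` (including its value) are assumptions made purely for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch only: hand-written fork/join code, not generated by
// the translator and not tied to the paper's runtime.
class DependencyFibSketch {

    // Hypothetical threshold below which a task runs sequentially, so that
    // task-creation overhead does not dominate the useful work.
    static final int SEQUENTIAL_CUTOFF = 12;

    // The original sequential recursion that a translator would start from.
    static long sequentialFib(int n) {
        return n < 2 ? n : sequentialFib(n - 1) + sequentialFib(n - 2);
    }

    // One task per call; the result of fib(n) depends on fib(n-1) and fib(n-2).
    static final class FibTask extends RecursiveTask<Long> {
        private final int n;

        FibTask(int n) {
            this.n = n;
        }

        @Override
        protected Long compute() {
            if (n <= SEQUENTIAL_CUTOFF) {
                return sequentialFib(n);
            }
            FibTask left = new FibTask(n - 1);
            FibTask right = new FibTask(n - 2);
            left.fork();                        // schedule fib(n-1) as an independent task
            long rightResult = right.compute(); // compute fib(n-2) in the current thread
            return left.join() + rightResult;   // the sum waits on both subresults
        }
    }

    static long parallelFib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }

    public static void main(String[] args) {
        System.out.println(parallelFib(30)); // prints 832040
    }
}
```

The two recursive calls share no written data, so they can run concurrently; only the final addition must wait for both. That read-after-write relation is the kind of dependency the abstract describes deriving automatically from the AST.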
Keywords: Automatic programming, automatic parallelization, task-based runtime, symbolic analysis, recursive procedures