Dependency-Based Automatic Parallelization of Java Applications

  • João Rafael
  • Ivo Correia
  • Alcides Fonseca
  • Bruno Cabral
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8806)

Abstract

Billions of lines of sequential code in today's software do not benefit from the parallelism available in modern multicore architectures. Transforming legacy sequential code into a parallel version of the same program is a complex and cumbersome task. Performing such a transformation automatically, without the intervention of a developer, has been a research objective for a long time. This work proposes an elegant way of achieving that goal. By targeting a task-based runtime that manages execution using a task dependency graph, we developed a translator for sequential Java code that generates a highly parallel version of the same program. The translation process analyzes the AST nodes for signatures such as read-write accesses and execution-flow modifications, among others, and generates a set of dependencies between executable tasks. This process has been applied to well-known problems, such as the recursive Fibonacci and FFT algorithms, resulting in versions capable of maximizing resource usage. For two CPU-bound applications we obtained speedups of 10.97x and 9.0x on a 12-core machine.
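As a rough illustration of the kind of transformation described above, the sketch below shows a sequential recursive Fibonacci next to a hand-written task-based version. It is not the output of the translator and does not use the authors' runtime; it assumes the standard `java.util.concurrent` fork/join framework as a stand-in, and the names `FibTaskSketch`, `FibTask`, and `CUTOFF` are hypothetical. The two recursive calls share no data, so they can run as separate tasks, while the final addition reads both results and therefore depends on both.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class FibTaskSketch {

    // Sequential version: the kind of input an automatic translator would start from.
    static long fibSequential(int n) {
        if (n < 2) return n;
        long a = fibSequential(n - 1);
        long b = fibSequential(n - 2);
        return a + b;
    }

    // Hand-written task version (illustrative only, not generated by the paper's tool).
    static final class FibTask extends RecursiveTask<Long> {
        // Assumed granularity threshold: below it, spawning tasks costs more than it saves.
        private static final int CUTOFF = 12;
        private final int n;

        FibTask(int n) { this.n = n; }

        @Override
        protected Long compute() {
            if (n < CUTOFF) return fibSequential(n);
            FibTask left = new FibTask(n - 1);
            FibTask right = new FibTask(n - 2);
            left.fork();               // fib(n-1) becomes an independent task
            long b = right.compute();  // fib(n-2) runs in the current worker
            long a = left.join();      // the sum depends on the forked result
            return a + b;
        }
    }

    static long fibParallel(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }

    public static void main(String[] args) {
        int n = 30;
        // Both versions should agree on the result.
        System.out.println(fibSequential(n) == fibParallel(n));
    }
}
```

The `join` call plays the role of an edge in a task dependency graph: the addition cannot start until the forked task has written its result. A dependency-based translator would have to derive such edges from the read and write accesses in the sequential code rather than rely on a programmer to place them.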

Keywords

Automatic programming, automatic parallelization, task-based runtime, symbolic analysis, recursive procedures

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • João Rafael (1)
  • Ivo Correia (1)
  • Alcides Fonseca (1)
  • Bruno Cabral (1)
  1. University of Coimbra, Portugal
