The Journal of Supercomputing, Volume 73, Issue 1, pp 88–99

BFCA+: automatic synthesis of parallel code with TLS capabilities

  • Sergio Aldea
  • Diego R. Llanos
  • Arturo Gonzalez-Escribano


Parallelizing a sequential application requires extracting information about its loops and how their variables are accessed, and then augmenting the source code accordingly. In this paper we propose a framework that automates this error-prone, time-consuming task. Our solution leverages compile-time information extracted from the source code to classify every variable used inside each loop according to how it is accessed. Our system, called BFCA+, then automatically instruments the source code with the OpenMP directives and clauses needed for parallel execution, using the standard shared and private clauses to express the variable classification. The framework can also instrument loops for speculative parallelization with the help of the ATLaS runtime system, which defines a new speculative clause to mark those variables that may lead to a dependency violation. As a result, the target loop is guaranteed to run correctly in parallel, preserving sequential semantics even in the presence of dependency violations. Our experimental evaluation shows that the framework not only saves development time, but also produces faster code than a manual parallelization.


Keywords: Automatic parallelization · Compiler framework · OpenMP · Source synthesis · Source transformation · Speculative parallelization · XML



This research has been partially supported by MICINN (Spain) and ERDF program of the European Union: HomProg-HetSys project (TIN2014-58876-P), CAPAP-H5 network (TIN2014-53522-REDT), and COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS).



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Sergio Aldea (1)
  • Diego R. Llanos (1)
  • Arturo Gonzalez-Escribano (1)

  1. ETS Ingeniería Informática, Universidad de Valladolid, Valladolid, Spain
