GLT: A Unified API for Lightweight Thread Libraries

  • Adrián Castelló
  • Sangmin Seo
  • Rafael Mayo
  • Pavan Balaji
  • Enrique S. Quintana-Ortí
  • Antonio J. Peña
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10417)


In recent years, several lightweight thread (LWT) libraries have emerged to tackle exascale challenges. These offer programming models (PMs) based on user-level threads and incorporate their own lightweight mechanisms. However, each library proposes its own PM, exposing different semantics and hindering portability.

To address this drawback, we have designed Generic Lightweight Thread (GLT), an application programming interface that frames the functionality of the most popular LWT libraries for high-performance computing under a single PM. We implement GLT on top of Argobots, MassiveThreads, and Qthreads. We provide GLT both as a dynamic library and as a static version that resolves calls through macro preprocessing to reduce overhead. This paper discusses the GLT PM and demonstrates its minimal performance impact.
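The difference between the dynamic and the macro-based static flavors can be illustrated with a minimal sketch. It is not GLT's actual code: the names `uapi_spawn`, `uapi_spawn_dyn`, `backend_a_spawn`, `backend_b_fork`, and `USE_BACKEND_B` are hypothetical, and the two "backends" are stand-ins that simply run the work function inline rather than the real Argobots, MassiveThreads, or Qthreads calls. The point is only that a macro lets the preprocessor rewrite the unified call into the backend call (including reordering arguments), whereas a library function adds one extra call level.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical work-unit signature shared by the unified layer. */
typedef void (*work_fn)(void *);

/* Stand-in backends (assumptions for illustration, NOT real library APIs).
 * Each runs the work inline and counts how often it was used. Note that
 * they take their arguments in different orders, as real libraries may. */
int backend_a_calls = 0;
int backend_b_calls = 0;

void backend_a_spawn(work_fn fn, void *arg) { backend_a_calls++; fn(arg); }
void backend_b_fork(void *arg, work_fn fn)  { backend_b_calls++; fn(arg); }

/* Static flavor: the unified name is a macro, so the preprocessor replaces
 * it with the selected backend call and no wrapper function remains. */
#if defined(USE_BACKEND_B)
#  define uapi_spawn(fn, arg) backend_b_fork((arg), (fn))
#else
#  define uapi_spawn(fn, arg) backend_a_spawn((fn), (arg))
#endif

/* Dynamic flavor: the unified name is a real function, as it would be when
 * exported from a shared library; each use costs one additional call. */
void uapi_spawn_dyn(work_fn fn, void *arg)
{
#if defined(USE_BACKEND_B)
    backend_b_fork(arg, fn);
#else
    backend_a_spawn(fn, arg);
#endif
}

/* Example work unit: increment the integer that arg points to. */
void add_one(void *arg) { (*(int *)arg)++; }

/* Spawn the same work once through each flavor and return the counter. */
int run_demo(void)
{
    int counter = 0;
    uapi_spawn(add_one, &counter);      /* resolved by the preprocessor */
    uapi_spawn_dyn(add_one, &counter);  /* resolved through a function  */
    return counter;
}
```

Compiling with `-DUSE_BACKEND_B` would switch both flavors to the second stand-in without touching the calling code, which is the portability property a unified API aims for.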



Researchers from the Universitat Jaume I de Castelló were supported by project TIN2014-53495-R of the MINECO, the Generalitat Valenciana fellowship programme Vali+d 2015, and FEDER. Antonio J. Peña is cofinanced by the Spanish Ministry of Economy and Competitiveness under Juan de la Cierva fellowship number IJCI-2015-23266. This work was partially supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research (SC-21), under contract DE-AC02-06CH11357.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Adrián Castelló (1)
  • Sangmin Seo (2)
  • Rafael Mayo (1)
  • Pavan Balaji (2)
  • Enrique S. Quintana-Ortí (1)
  • Antonio J. Peña (3)

  1. Universitat Jaume I de Castelló, Castellón de la Plana, Spain
  2. Argonne National Laboratory, Lemont, USA
  3. Barcelona Supercomputing Center (BSC), Barcelona, Spain
