A library implementation of the nano-threads programming model

  • Xavier Martorell
  • Jesus Labarta
  • Nacho Navarro
  • Eduard Ayguade
Workshop 17: Scheduling and Load Balancing
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1124)


In this paper we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to manage application parallelism efficiently at user level. Nano-thread applications work close to the operating system so that they can adapt quickly to resource availability.

The goal is to execute nano-threads efficiently in parallel by balancing the work assigned to each thread against the thread management overhead. Early experiments indicate that a granularity of around 800 operations per thread keeps the overhead below 10%. Recent experiments show that this nano-thread granularity is fine enough to adapt easily to system conditions, yielding a reduced response time.
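The trade-off between granularity and overhead can be illustrated with a minimal sketch. It assumes that overhead is measured as the management cost divided by the total per-thread cost (work plus management), both counted in operations. The abstract does not give this definition, and the management cost of 80 operations used below is a hypothetical value chosen only for illustration; the function names are likewise not from the paper.

```python
import math
from fractions import Fraction


def overhead_fraction(work_ops: int, mgmt_ops: int) -> float:
    """Management cost as a fraction of the total per-thread cost.

    Assumed definition (not taken from the paper):
    overhead = mgmt_ops / (work_ops + mgmt_ops)
    """
    return mgmt_ops / (work_ops + mgmt_ops)


def min_granularity(mgmt_ops: int, max_overhead: Fraction) -> int:
    """Smallest number of work operations per thread for which the
    overhead fraction does not exceed max_overhead.

    From m / (w + m) <= f it follows that w >= m * (1 - f) / f.
    Exact rational arithmetic avoids floating-point rounding at the bound.
    """
    if not 0 < max_overhead < 1:
        raise ValueError("max_overhead must lie strictly between 0 and 1")
    return math.ceil(mgmt_ops * (1 - max_overhead) / max_overhead)


if __name__ == "__main__":
    HYPOTHETICAL_MGMT_OPS = 80  # illustrative value, not a measured cost
    for work in (100, 400, 800, 1600):
        pct = 100 * overhead_fraction(work, HYPOTHETICAL_MGMT_OPS)
        print(f"{work:5d} operations per thread: {pct:.1f}% overhead")
    bound = min_granularity(HYPOTHETICAL_MGMT_OPS, Fraction(1, 10))
    print(f"minimum granularity for 10% overhead: {bound} operations")
```

Under these assumptions, coarser threads amortize the fixed management cost over more work, which is why a minimum granularity exists for any overhead target.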





Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Xavier Martorell (1)
  • Jesus Labarta (1)
  • Nacho Navarro (1)
  • Eduard Ayguade (1)
  1. Departament d'Arquitectura de Computadors (DAC), Universitat Politecnica de Catalunya (UPC), Barcelona, Spain
