Starsscheck: A Tool to Find Errors in Task-Based Parallel Programs

  • Paul M. Carpenter
  • Alex Ramirez
  • Eduard Ayguade
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6271)


Star Superscalar is a task-based programming model. The programmer starts with an ordinary C program, and adds pragmas to mark functions as tasks, identifying their inputs and outputs. When the main thread reaches a task, an instance of the task is added to a run-time dependency graph, and later scheduled to run on a processor. Variants of Star Superscalar exist for the Cell Broadband Engine and SMPs.
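As an illustration of the annotation style described above, the following is a minimal sketch rather than code from the paper. The pragma spelling (`#pragma css task`, `input`, `output`, `#pragma css barrier`) is modelled on the CellSs/SMPSs manuals and should be treated as an assumption to check against the manual for the version in use. The function names and the block size `N` are hypothetical. A plain C compiler ignores the unknown pragmas, possibly with a warning, so the same source also runs sequentially.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* hypothetical block size, chosen only for this sketch */

/* Assumed annotation: marks double_block as a task that only reads
 * src and only writes dst. */
#pragma css task input(src) output(dst)
void double_block(const float src[N], float dst[N])
{
    for (size_t i = 0; i < N; i++)
        dst[i] = 2.0f * src[i];
}

/* Assumed annotation: reads a and b, writes sum. */
#pragma css task input(a, b) output(sum)
void add_block(const float a[N], const float b[N], float sum[N])
{
    for (size_t i = 0; i < N; i++)
        sum[i] = a[i] + b[i];
}

/* Main-thread code: each call to an annotated function would become a
 * task instance in the run-time dependency graph. */
float triple_and_total(const float in[N])
{
    float tmp[N], out[N];
    float total = 0.0f;

    assert(in != NULL);

    double_block(in, tmp);    /* task 1 writes tmp */
    add_block(tmp, in, out);  /* task 2 reads tmp, so it depends on task 1 */

    /* Assumed barrier pragma: wait for outstanding tasks before the
     * main thread reads out. */
#pragma css barrier
    for (size_t i = 0; i < N; i++)
        total += out[i];
    return total;
}
```

If `tmp` were wrongly listed as an input of `double_block`, the run-time would have no reason to order the two tasks, which is the kind of annotation error the abstract goes on to discuss.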

Star Superscalar relies on the annotations provided by the programmer. If these are incorrect, the program may exhibit race conditions or exceptions deep inside the run-time system.

This paper introduces Starsscheck, a tool based on Valgrind, which helps debug Star Superscalar programs. Starsscheck verifies that the pragma annotations are correct, producing a warning if a task or the main thread performs an invalid access. The tool can be adapted to support similar programming models such as TPC. For most benchmarks, Starsscheck is faster than memcheck, the default Valgrind tool.
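The abstract does not describe how the check is implemented, so the following is only a hypothetical sketch of the idea of validating an access against a task's declared regions. It is not Starsscheck's algorithm and does not use the Valgrind API. All names (`region_t`, `direction_t`, `access_is_valid`) are invented for illustration. As a simplification, reads are accepted only in input and inout regions and writes only in output and inout regions. A real tool would also need to handle cases such as a task reading back its own output or using its stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical directionality of a declared task parameter. */
typedef enum { DIR_INPUT, DIR_OUTPUT, DIR_INOUT } direction_t;

/* Hypothetical record of one declared memory region. */
typedef struct {
    uintptr_t   base;  /* first byte of the region */
    size_t      size;  /* length in bytes */
    direction_t dir;   /* how the task declared it */
} region_t;

/* True if the access [addr, addr + len) lies entirely inside r.
 * Written to avoid overflow in addr + len. */
static bool region_contains(const region_t *r, uintptr_t addr, size_t len)
{
    return addr >= r->base
        && len <= r->size
        && addr - r->base <= r->size - len;
}

/* True if some declared region permits this access. */
bool access_is_valid(const region_t *regions, size_t n,
                     uintptr_t addr, size_t len, bool is_write)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const region_t *r = &regions[i];
        if (!region_contains(r, addr, len))
            continue;
        if (is_write && r->dir != DIR_INPUT)
            return true;   /* write to an output or inout region */
        if (!is_write && r->dir != DIR_OUTPUT)
            return true;   /* read from an input or inout region */
    }
    return false;          /* no region allows it: would be reported */
}
```

In this sketch a write to a region declared as input, or an access that runs past the end of a declared array, returns `false`, corresponding to the warning the abstract mentions for an invalid access.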


Keywords (added by machine, not by the authors): Shared Memory, Runtime System, Number Block, Data Race, Main Thread





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Paul M. Carpenter (1)
  • Alex Ramirez (1)
  • Eduard Ayguade (1)

  1. Barcelona Supercomputing Center, Barcelona, Spain
