ScalaJack: Customized Scalable Tracing with In-situ Data Analysis

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8632)


Root cause diagnosis of large-scale HPC applications often fails because tools, specifically trace-based ones, can no longer record all the metrics they measure. We address this problem by combining customized tracing with support for in-situ data analysis via ScalaJack, a framework with customizable instrumentation and pluggable extension capabilities for problem-directed instrumentation and in-situ data analysis. We further eliminate cross-cutting concerns by refactoring the code for aspect orientation, and we evaluate these capabilities in case studies within and beyond the scope of tracing.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. North Carolina State University, USA
