Folding: Detailed Analysis with Coarse Sampling

  • Conference paper

Abstract

Performance analysis tools help application users find the bottlenecks that prevent an application from running at full speed on current supercomputers. The level of detail and the accuracy of these tools are crucial for fully characterizing such bottlenecks. The detail exposed depends not only on the nature of the tool (profile-based or trace-based) but also on the mechanism it relies on to gather information (instrumentation or sampling).

In this paper we present a mechanism called folding that combines instrumentation and sampling for trace-based performance analysis tools. Folding takes advantage of long execution runs and low-frequency sampling to describe the evolution of the user code in fine detail with minimal overhead on the application. The reports it produces help to understand the behavior of a region of code at a very low level. We also present a practical study carried out in an in-production scenario and show that the results of folding resemble those of high-frequency sampling.
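The abstract does not spell out how folding works, so the following is only a minimal sketch under stated assumptions: instrumentation is assumed to record the start and end time of every instance of a repeated code region, and each sparse sample is assumed to be projected onto a single synthetic instance by its relative position inside the instance it fell into. The function name `fold_samples` and the data layout are hypothetical and are not taken from the paper or from any tool.

```python
from bisect import bisect_right


def fold_samples(instances, samples):
    """Project sparse samples onto one synthetic instance of a region.

    instances: sorted, non-overlapping (start, end) pairs, assumed to come
               from instrumentation of a repeated code region.
    samples:   (timestamp, value) pairs, assumed to come from a
               low-frequency sampler.
    Returns (relative_position, value) pairs ordered by position in [0, 1).
    """
    starts = [start for start, _ in instances]
    folded = []
    for timestamp, value in samples:
        # Find the last instance that starts at or before the sample.
        idx = bisect_right(starts, timestamp) - 1
        if idx < 0:
            continue  # sample precedes the first instance
        start, end = instances[idx]
        if timestamp >= end:
            continue  # sample fell between two instances
        folded.append(((timestamp - start) / (end - start), value))
    folded.sort(key=lambda pair: pair[0])
    return folded


# Three instances of the same region, each hit by at most one sample.
instances = [(0.0, 10.0), (20.0, 30.0), (40.0, 50.0)]
samples = [(2.0, "a"), (15.0, "x"), (25.0, "b"), (48.0, "c")]
folded = fold_samples(instances, samples)
```

In this toy input each instance receives at most one sample, yet the folded view holds three points spread across the region. That is the intuition behind obtaining fine detail from coarse sampling when the run is long enough to contain many instances of the same region.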

Notes

  1. http://www.vim.org

  2. http://www.gnome.org

Acknowledgements

This work is funded by the IBM/BSC MareIncognito project and by the Comisión Interministerial de Ciencia y Tecnología (CICYT) under Contract No. TIN2007-60625.

Author information

Corresponding author

Correspondence to Harald Servat.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Servat, H., Llort, G., Giménez, J., Huck, K., Labarta, J. (2012). Folding: Detailed Analysis with Coarse Sampling. In: Brunst, H., Müller, M., Nagel, W., Resch, M. (eds) Tools for High Performance Computing 2011. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31476-6_9

  • DOI: https://doi.org/10.1007/978-3-642-31476-6_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31475-9

  • Online ISBN: 978-3-642-31476-6

  • eBook Packages: Computer Science, Computer Science (R0)
