Efficient Methods for Trace Analysis Parallelization

Reumont-Locke, Fabien; Ezzati-Jivan, Naser; Dagenais, Michel R.

doi:10.1007/s10766-019-00631-4

Published: 09 February 2019

International Journal of Parallel Programming, Volume 47, pages 951–972 (2019)

Fabien Reumont-Locke¹,
Naser Ezzati-Jivan¹ &
Michel R. Dagenais¹

Abstract

Tracing provides a low-impact, high-resolution way to observe the execution of a system. As the amount of parallelism in traced systems increases, so does the volume of data generated by the trace. Most trace analysis tools run in a single thread, which limits their performance as the scale of the data increases. In this paper, we explore parallelization as an approach to speed up system trace analysis. We propose a solution that uses inherent properties of the CTF trace format to create balanced and parallelizable workloads. Our solution takes into account the key factors of parallelization: good load balancing, low synchronization overhead and efficient resolution of data dependencies. We also propose an algorithm to detect and resolve data dependencies during trace analysis with minimal locking and synchronization. Using this approach, we implement three trace analysis programs (event counting, CPU usage analysis and I/O usage analysis) to assess scalability in terms of parallel efficiency. With 32 cores, the parallel implementations achieve a parallel efficiency above 56%, which translates to a speedup of 18 times the serial speed, with trace data stored on consumer-grade solid-state storage devices. We also show the scalability and potential of our approach by measuring the effect of future improvements to trace decoding on parallel efficiency.
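
As a quick check of how the two figures quoted in the abstract relate, the sketch below uses the standard definition of parallel efficiency (speedup divided by the number of cores). The function names are illustrative assumptions and do not come from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Standard definition: efficiency E = S / p for speedup S on p cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return speedup / cores


def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Inverse relation: S = E * p."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return efficiency * cores


if __name__ == "__main__":
    # Figures taken from the abstract: 32 cores, speedup of 18, efficiency above 56%.
    print(f"efficiency at 18x on 32 cores: {parallel_efficiency(18, 32):.2%}")
    print(f"speedup at 56% on 32 cores: {speedup_from_efficiency(0.56, 32):.2f}x")
```

Under this definition, a speedup of 18 on 32 cores corresponds to an efficiency of 56.25%, which is consistent with the "above 56%" reported in the abstract.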

Acknowledgements

The financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC) and Ericsson Software Research is gratefully acknowledged. We would also like to thank Francis Giraldeau for his advice and comments, as well as Geneviève Bastien and Julien Desfossez for their help.

Author information

Fabien Reumont-Locke
Present address: Google, Mountain View, USA

Authors and Affiliations

Computer and Software Engineering Department, Ecole Polytechnique de Montreal, P.O. Box 6079, Succ. Downtown, Montreal, QC, H3C 3A7, Canada
Fabien Reumont-Locke, Naser Ezzati-Jivan & Michel R. Dagenais

Corresponding author

Correspondence to Naser Ezzati-Jivan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Reumont-Locke, F., Ezzati-Jivan, N. & Dagenais, M.R. Efficient Methods for Trace Analysis Parallelization. Int J Parallel Prog 47, 951–972 (2019). https://doi.org/10.1007/s10766-019-00631-4

Received: 11 October 2017
Accepted: 05 February 2019
Published: 09 February 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10766-019-00631-4
