Software Streams: Big Data Challenges in Dynamic Program Analysis

  • Conference paper
The Nature of Computation. Logic, Algorithms, Applications (CiE 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7921)

Abstract

Dynamic program analysis encompasses the development of techniques and tools for analyzing computer software by exploiting information gathered from a program at runtime. The impressive amounts of data collected by dynamic analysis tools require efficient indexing and compression schemes, as well as on-line algorithmic techniques for mining relevant information on-the-fly in order to identify frequent events, hidden software patterns, or undesirable behaviors corresponding to bugs, malware, or intrusions. The paper explores how recent results in algorithmic theory for data-intensive scenarios can be applied to the design and implementation of dynamic program analysis tools, focusing on two important techniques: sampling and streaming.
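
As a rough illustration of the two techniques the abstract names, the following minimal sketch treats a program trace as a stream of events. It is not taken from the paper: the function names `reservoir_sample` and `frequent_events`, the toy trace, and the choice of algorithms (classic reservoir sampling and a Misra-Gries style counter summary) are assumptions made purely for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Sampling: keep a uniform random sample of k events from a stream
    whose length is not known in advance (classic reservoir sampling)."""
    rng = rng or random.Random()
    reservoir = []
    for i, event in enumerate(stream):
        if i < k:
            # The first k events fill the reservoir.
            reservoir.append(event)
        else:
            # Event i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = event
    return reservoir


def frequent_events(stream, k):
    """Streaming: Misra-Gries style summary using at most k - 1 counters.
    Every event occurring more than n / k times in a stream of n events
    is guaranteed to survive; the stored counts are underestimates."""
    counters = {}
    for event in stream:
        if event in counters:
            counters[event] += 1
        elif len(counters) < k - 1:
            counters[event] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # Hypothetical trace: routine "f" dominates the execution.
    trace = ["f"] * 60 + ["g"] * 30 + ["h"] * 10
    random.Random(1).shuffle(trace)
    print(reservoir_sample(trace, 5, random.Random(0)))
    print(frequent_events(trace, 3))
```

Both routines read the trace once and use memory that depends only on `k`, not on the trace length, which is the property that makes such techniques attractive when traces are too large to store.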

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Finocchi, I. (2013). Software Streams: Big Data Challenges in Dynamic Program Analysis. In: Bonizzoni, P., Brattka, V., Löwe, B. (eds) The Nature of Computation. Logic, Algorithms, Applications. CiE 2013. Lecture Notes in Computer Science, vol 7921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39053-1_15

  • DOI: https://doi.org/10.1007/978-3-642-39053-1_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39052-4

  • Online ISBN: 978-3-642-39053-1

  • eBook Packages: Computer Science (R0)
