Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 12728)

Abstract

Applications in which streams of data pass through large data structures are increasingly important. For instance, network intrusion detection, and cyber security as a whole, rely on real-time analysis of network traffic. Unfortunately, when implemented on conventional architectures, such applications become highly inefficient, especially when attempts are made to scale up performance through parallelism. An earlier paper discussed an implementation of the Firehose streaming benchmark that assumed only a bounded number of keys and datums. This paper discusses a significantly more complex (and more realistic) variant that analyzes continuously streaming samples from an unbounded range of keys. We use a novel migrating thread architecture in which threads may migrate as needed through a single system-wide shared memory space, thereby avoiding conventional inefficiencies. As with the earlier paper, results are promising, with both far better scaling and higher performance than previously reported implementations, on a platform with considerably fewer intrinsic hardware computational resources.
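To make the bounded versus unbounded distinction concrete, here is a minimal single-threaded sketch of per-key anomaly detection when the key range is unbounded, so per-key state has to be capped and old keys evicted. It is not the paper's implementation and does not model migrating threads. The class name `BoundedKeyDetector`, the least-recently-used eviction policy, and the `capacity`, `window`, and `threshold` parameters are all illustrative assumptions rather than values taken from the paper or the Firehose benchmark.

```python
from collections import OrderedDict


class BoundedKeyDetector:
    """Hypothetical sketch: per-key counters held in a fixed-size LRU table."""

    def __init__(self, capacity, window, threshold):
        self.capacity = capacity    # max number of keys tracked at once (assumed)
        self.window = window        # observations per key before a decision (assumed)
        self.threshold = threshold  # a key is flagged if its ones-count <= threshold (assumed)
        self.state = OrderedDict()  # key -> (observations seen, sum of bits)

    def observe(self, key, bit):
        """Consume one (key, bit) sample; return a verdict once the window fills."""
        count, ones = self.state.pop(key, (0, 0))
        count += 1
        ones += bit
        if count == self.window:
            # Decision reached: the key's state is dropped, freeing a slot.
            return "anomalous" if ones <= self.threshold else "normal"
        # Re-insert so the key becomes the most recently used entry.
        self.state[key] = (count, ones)
        if len(self.state) > self.capacity:
            # Unbounded key space: evict the least recently used key.
            self.state.popitem(last=False)
        return None


if __name__ == "__main__":
    detector = BoundedKeyDetector(capacity=2, window=3, threshold=1)
    for key, bit in [("a", 0), ("a", 1), ("a", 0), ("b", 1)]:
        print(key, bit, detector.observe(key, bit))
```

With a bounded key set the table can be sized once and never evicts; with an unbounded one, every insertion may trigger an eviction, which is one reason the unbounded variant is more demanding on the memory system.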

Keywords

  • Streaming
  • Emerging architectures
  • Scalability
  • Communication overhead

Notes

  1. https://firehose.sandia.gov.

  2. Lucata, formerly EMU Solutions Inc.

  3. https://crnch.gatech.edu/rogues-Lucata.

Author information

Corresponding author

Correspondence to Brian A. Page.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Page, B.A., Kogge, P.M. (2021). Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads. In: Chamberlain, B.L., Varbanescu, A.L., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12728. Springer, Cham. https://doi.org/10.1007/978-3-030-78713-4_9

  • DOI: https://doi.org/10.1007/978-3-030-78713-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78712-7

  • Online ISBN: 978-3-030-78713-4

  • eBook Packages: Computer Science, Computer Science (R0)