HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM

  • Conference paper
OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity (OpenSHMEM 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11283)

Abstract

Many problems can benefit from being phrased as a graph processing or graph analytics problem: infectious disease modeling, insider threat detection, fraud prevention, social network analysis, and more. These problems all share a common property: the relationships between entities in these systems are crucial to understanding the overall behavior of the systems themselves. However, relations are rarely if ever static. As our ability to collect information on those relations improves (e.g., on financial transactions in fraud prevention), the value added by large-scale, high-performance, dynamic/streaming (rather than static) graph analysis becomes significant.

This paper introduces HOOVER, a distributed software framework for large-scale, dynamic graph modeling and analysis. HOOVER sits on top of OpenSHMEM, a PGAS programming system, and enables users to plug in application-specific logic while handling all runtime coordination of computation and communication. HOOVER has demonstrated scaling out to 24,576 cores, and is flexible enough to support a wide range of graph-based applications, including infectious disease modeling and anomaly detection.

Acknowledgments

The authors would like to thank Steve Poole (LANL) for his valuable feedback on the HOOVER project and this manuscript.

Work on HOOVER was funded in part by the United States Department of Defense, and was supported by resources at Los Alamos National Laboratory. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Los Alamos National Laboratory publication number LA-UR-18-27825.

Author information

Corresponding author

Correspondence to Max Grossman.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Grossman, M., Pritchard, H., Curtis, T., Sarkar, V. (2019). HOOVER: Distributed, Flexible, and Scalable Streaming Graph Processing on OpenSHMEM. In: Pophale, S., Imam, N., Aderholdt, F., Gorentla Venkata, M. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity. OpenSHMEM 2018. Lecture Notes in Computer Science, vol 11283. Springer, Cham. https://doi.org/10.1007/978-3-030-04918-8_7

  • DOI: https://doi.org/10.1007/978-3-030-04918-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04917-1

  • Online ISBN: 978-3-030-04918-8

  • eBook Packages: Computer Science (R0)