Autonomous Security Mechanisms for High-Performance Computing Systems: Review and Analysis

Adaptive Autonomous Secure Cyber Systems

Abstract

High-performance computing (HPC) plays an increasingly important role in research, commerce, and national security. Although HPC systems may inherit security issues from general-purpose computers, simply retrofitting traditional security mechanisms onto HPC systems is inappropriate or ineffective. In this chapter, we give an overview of the design and architecture of HPC systems and analyze their potential threats and vulnerabilities. We also analyze defense mechanisms for autonomous cyber defense in HPC systems in terms of implementation, methodology, application, and performance. The chapter provides a comprehensive review of autonomous security mechanisms for HPC systems and sheds light on how security defense mechanisms can be applied to them.

Author information

Corresponding author

Correspondence to Zhuo Lu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hou, T., Wang, T., Shen, D., Lu, Z., Liu, Y. (2020). Autonomous Security Mechanisms for High-Performance Computing Systems: Review and Analysis. In: Jajodia, S., Cybenko, G., Subrahmanian, V., Swarup, V., Wang, C., Wellman, M. (eds) Adaptive Autonomous Secure Cyber Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-33432-1_6

  • DOI: https://doi.org/10.1007/978-3-030-33432-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33431-4

  • Online ISBN: 978-3-030-33432-1

  • eBook Packages: Computer Science (R0)
