
Collaboration and Negotiation

Chapter in: Autonomous Intelligent Cyber Defense Agent (AICA)

Abstract

Collaboration and Negotiation is a critical high-level function of an Autonomous Intelligent Cyber-Defense Agent (AICA) that enables communication among agents, central cyber command-and-control (C2), and human operators. Maintaining the Confidentiality, Integrity, and Availability (CIA) triad while achieving mission goals requires stealthy AICA agents to exercise: (1) minimal communication, to avoid detection; (2) verification of received information with limited resources; and (3) active learning during operations to address dynamic conditions. Moreover, negotiating to jointly identify and execute a Course of Action (COA) requires building consensus in distributed and/or decentralized multi-agent settings under information uncertainty. This chapter presents algorithmic approaches for enabling the collaboration and negotiation function, identifies the strengths and limitations of candidate techniques, and illustrates a representative example: a two-tier Multi-Agent Reinforcement Learning (MARL) algorithm implemented to learn joint strategies among agents in a navigation and communication simulation environment. In simulation experiments, emergent collaborative team behaviors were observed among agents under information uncertainty. Recommendations for future development are also discussed.
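
The chapter's environment and algorithm are not reproduced on this page, so the sketch below is only a minimal, hypothetical Python illustration of the two-tier idea: a high tier in which each agent decides whether to broadcast its position (incurring a small stealth cost), and a low tier in which it chooses a movement action conditioned on whatever the peer shared, both learned with tabular Q-learning on a toy grid navigation task. Every parameter, reward, and environment detail here is an assumption for illustration, not the chapter's implementation.

```python
# Minimal two-tier multi-agent Q-learning sketch (illustrative only; not the
# chapter's implementation). Assumed setup: two agents on a 5x5 grid seek a
# shared goal. High tier: broadcast own position or stay silent (stealth cost).
# Low tier: pick a movement, conditioned on the peer's position if broadcast.
import random

GRID, GOAL, EPISODES, STEPS = 5, (4, 4), 2000, 40
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1          # assumed learning parameters
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def clip(v):
    return max(0, min(GRID - 1, v))

class TwoTierAgent:
    def __init__(self):
        self.q_high = {}  # (own position, comm bit) -> value
        self.q_low = {}   # ((own position, peer position or None), move) -> value

    def act(self, q, state, n_actions):
        if random.random() < EPS:  # epsilon-greedy exploration
            return random.randrange(n_actions)
        values = [q.get((state, a), 0.0) for a in range(n_actions)]
        return values.index(max(values))

    def update(self, q, state, action, reward, next_state, n_actions):
        best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)

agents = [TwoTierAgent(), TwoTierAgent()]
for episode in range(EPISODES):
    pos = [(0, 0), (0, GRID - 1)]
    for _ in range(STEPS):
        # High tier: decide whether to broadcast position (1) or stay silent (0).
        comm = [a.act(a.q_high, pos[i], 2) for i, a in enumerate(agents)]
        # Low tier: observed state includes the peer's position only if shared.
        states = [(pos[i], pos[1 - i] if comm[1 - i] else None) for i in range(2)]
        moves = [a.act(a.q_low, states[i], 4) for i, a in enumerate(agents)]
        new_pos = [(clip(p[0] + MOVES[m][0]), clip(p[1] + MOVES[m][1]))
                   for p, m in zip(pos, moves)]
        for i, agent in enumerate(agents):
            reward = 1.0 if new_pos[i] == GOAL else -0.01  # step cost until goal
            reward -= 0.005 * comm[i]                      # stealth cost of talking
            if new_pos[0] == new_pos[1] and new_pos[i] != GOAL:
                reward -= 0.05                             # collision penalty
            next_state = (new_pos[i], new_pos[1 - i] if comm[1 - i] else None)
            agent.update(agent.q_low, states[i], moves[i], reward, next_state, 4)
            agent.update(agent.q_high, pos[i], comm[i], reward, new_pos[i], 2)
        pos = new_pos
        if all(p == GOAL for p in pos):
            break
```

In this toy setup, agents that train long enough tend to broadcast only when the coordination benefit outweighs the stealth cost, which is one simple way emergent cooperative behavior can arise under partial information; the chapter's actual environment, reward design, and algorithm differ.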


Acknowledgments

The research described in this book chapter is part of the Mathematics for Artificial Reasoning in Science Initiative at Pacific Northwest National Laboratory (PNNL). It was conducted under the Laboratory Directed Research and Development Program at PNNL, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RL01830.

Author information

Correspondence to Samrat Chatterjee.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Chatterjee, S. et al. (2023). Collaboration and Negotiation. In: Kott, A. (eds) Autonomous Intelligent Cyber Defense Agent (AICA). Advances in Information Security, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-29269-9_11

  • DOI: https://doi.org/10.1007/978-3-031-29269-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29268-2

  • Online ISBN: 978-3-031-29269-9

  • eBook Packages: Computer Science, Computer Science (R0)
