Abstract
Collaboration and Negotiation is a critical high-level function of an Autonomous Intelligent Cyber-Defense Agent (AICA) that enables communication among agents, central cyber command-and-control (C2), and human operators. Maintaining the Confidentiality, Integrity, and Availability (CIA) triad while achieving mission goals requires stealthy AICA agents to exercise: (1) minimal communication, only as needed, to avoid detection; (2) verification of received information with limited resources; and (3) active learning during operations to adapt to dynamic conditions. Moreover, negotiations to jointly identify and execute a Course of Action (COA) require building consensus in distributed and/or decentralized multi-agent settings under information uncertainty. This chapter presents algorithmic approaches for enabling the collaboration and negotiation function. Strengths and limitations of potential techniques are identified, and a representative example is illustrated. Specifically, a two-tier Multi-Agent Reinforcement Learning (MARL) algorithm was implemented to learn joint strategies among agents in a navigation and communication simulation environment. In simulation experiments, emergent collaborative team behaviors among agents were observed under information uncertainty. Recommendations for future development are also discussed.
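To make the two-tier idea concrete, the following is a minimal, hypothetical sketch — not the chapter's actual algorithm or environment. It assumes a toy 1-D corridor where each of two independent tabular Q-learning agents picks a composite action per step: a low-tier navigation move and a high-tier decision of whether to communicate, with communication carrying a stealth penalty. All environment details (corridor size, rewards, the rendezvous cell) are illustrative assumptions.

```python
import random

# Toy two-tier decision: each agent's action pairs a navigation move
# (low tier) with a communicate-or-not flag (high tier).
N = 5                      # corridor cells 0..4; cell 0 is the rendezvous
MOVES = [-1, 0, 1]         # low-tier navigation choices
ACTIONS = [(m, c) for m in MOVES for c in (0, 1)]  # (move, comm) pairs

def step(pos, action):
    """Deterministic toy environment: step cost, stealth penalty for
    communicating, and a bonus on reaching the rendezvous cell."""
    move, comm = ACTIONS[action]
    new_pos = min(N - 1, max(0, pos + move))
    reward = -0.1 - (0.5 if comm else 0.0)
    if new_pos == 0:
        reward += 10.0
    return new_pos, reward, new_pos == 0

def train(episodes=3000, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    """Independent Q-learning: one Q-table per agent, no parameter sharing."""
    rng = random.Random(seed)
    q = [[[0.0] * len(ACTIONS) for _ in range(N)] for _ in range(2)]
    for _ in range(episodes):
        pos = [rng.randrange(N), rng.randrange(N)]
        done = [False, False]
        for _ in range(20):
            for i in range(2):
                if done[i]:
                    continue
                s = pos[i]
                # Epsilon-greedy action selection over composite actions.
                a = (rng.randrange(len(ACTIONS)) if rng.random() < eps
                     else max(range(len(ACTIONS)), key=lambda k: q[i][s][k]))
                s2, r, d = step(s, a)
                target = r + (0.0 if d else gamma * max(q[i][s2]))
                q[i][s][a] += alpha * (target - q[i][s][a])
                pos[i], done[i] = s2, d
            if all(done):
                break
    return q

def greedy_action(q, agent, pos):
    return ACTIONS[max(range(len(ACTIONS)), key=lambda k: q[agent][pos][k])]
```

Because communicating costs reward, trained agents learn to move toward the rendezvous cell while keeping the comm flag off — a toy analogue of the minimal-communication behavior described above. For example, `greedy_action(train(), 0, 4)` yields a leftward move with communication suppressed.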
Acknowledgments
The research described in this book chapter is part of the Mathematics for Artificial Reasoning in Science Initiative at Pacific Northwest National Laboratory (PNNL). It was conducted under the Laboratory Directed Research and Development Program at PNNL, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RL01830.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Chatterjee, S. et al. (2023). Collaboration and Negotiation. In: Kott, A. (eds) Autonomous Intelligent Cyber Defense Agent (AICA). Advances in Information Security, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-29269-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29268-2
Online ISBN: 978-3-031-29269-9
eBook Packages: Computer Science (R0)