Abstract
Collaboration and Negotiation is a critical high-level function of an Autonomous Intelligent Cyber-Defense Agent (AICA) that enables communication among agents, central cyber command-and-control (C2), and human operators. Maintaining the Confidentiality, Integrity, and Availability (CIA) triad while achieving mission goals requires stealthy AICA agents to exercise: (1) minimal communication, only as needed, to avoid detection; (2) verification of received information with limited resources; and (3) active learning during operations to adapt to dynamic conditions. Moreover, negotiations to jointly identify and execute a Course of Action (COA) require building consensus in distributed and/or decentralized multi-agent settings under information uncertainty. This chapter presents algorithmic approaches for enabling the collaboration and negotiation function. Strengths and limitations of potential techniques are identified, and a representative example is illustrated. Specifically, a two-tier Multi-Agent Reinforcement Learning (MARL) algorithm was implemented to learn joint strategies among agents in a navigation and communication simulation environment. In simulation experiments, emergent collaborative team behaviors among agents were observed under information uncertainty. Recommendations for future development are also discussed.
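To make the two-tier idea concrete, the following is a minimal, hypothetical sketch — not the chapter's actual algorithm or environment. It assumes a toy 1-D corridor where each of two independent tabular Q-learning agents picks a composite action per step: a low-tier navigation move and a high-tier decision of whether to communicate, with communication carrying a stealth penalty. All environment details (corridor size, rewards, the rendezvous cell) are illustrative assumptions.

```python
import random

# Toy two-tier decision: each agent's action pairs a navigation move
# (low tier) with a communicate-or-not flag (high tier).
N = 5                      # corridor cells 0..4; cell 0 is the rendezvous
MOVES = [-1, 0, 1]         # low-tier navigation choices
ACTIONS = [(m, c) for m in MOVES for c in (0, 1)]  # (move, comm) pairs

def step(pos, action):
    """Deterministic toy environment: step cost, stealth penalty for
    communicating, and a bonus on reaching the rendezvous cell."""
    move, comm = ACTIONS[action]
    new_pos = min(N - 1, max(0, pos + move))
    reward = -0.1 - (0.5 if comm else 0.0)
    if new_pos == 0:
        reward += 10.0
    return new_pos, reward, new_pos == 0

def train(episodes=3000, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    """Independent Q-learning: one Q-table per agent, no parameter sharing."""
    rng = random.Random(seed)
    q = [[[0.0] * len(ACTIONS) for _ in range(N)] for _ in range(2)]
    for _ in range(episodes):
        pos = [rng.randrange(N), rng.randrange(N)]
        done = [False, False]
        for _ in range(20):
            for i in range(2):
                if done[i]:
                    continue
                s = pos[i]
                # Epsilon-greedy action selection over composite actions.
                a = (rng.randrange(len(ACTIONS)) if rng.random() < eps
                     else max(range(len(ACTIONS)), key=lambda k: q[i][s][k]))
                s2, r, d = step(s, a)
                target = r + (0.0 if d else gamma * max(q[i][s2]))
                q[i][s][a] += alpha * (target - q[i][s][a])
                pos[i], done[i] = s2, d
            if all(done):
                break
    return q

def greedy_action(q, agent, pos):
    return ACTIONS[max(range(len(ACTIONS)), key=lambda k: q[agent][pos][k])]
```

Because communicating costs reward, trained agents learn to move toward the rendezvous cell while keeping the comm flag off — a toy analogue of the minimal-communication behavior described above. For example, `greedy_action(train(), 0, 4)` yields a leftward move with communication suppressed.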
Acknowledgments
The research described in this book chapter is part of the Mathematics for Artificial Reasoning in Science Initiative at Pacific Northwest National Laboratory (PNNL). It was conducted under the Laboratory Directed Research and Development Program at PNNL, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RL01830.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Chatterjee, S. et al. (2023). Collaboration and Negotiation. In: Kott, A. (eds) Autonomous Intelligent Cyber Defense Agent (AICA). Advances in Information Security, vol 87. Springer, Cham. https://doi.org/10.1007/978-3-031-29269-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29268-2
Online ISBN: 978-3-031-29269-9
eBook Packages: Computer Science (R0)