
eSense 2.0: Modeling Multi-agent Biomimetic Predation with Multi-layered Reinforcement Learning

  • D. Michael Franklin
  • Derek Martin
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 70)

Abstract

Learning in multi-agent systems, especially where adversarial behavior is exhibited, is difficult and challenging. Learning in these complicated environments is often muddied by the multitude of conflicting or poorly correlated data coming from the many agents and their diverse goals. This setting should not be conflated with well-known flocking-type behaviors, where every agent follows the same policy; in our scenario, each agent may have its own policy, set of behaviors, or overall group strategy. Most learning algorithms observe the actions of the agents and feed those observations into a model-building process. When these actions are consistent, a reasonable model can be formed; eSense, however, was designed to work even when observing complicated and highly interactive multi-agent behavior. eSense provides a powerful yet simple reinforcement learning algorithm that employs model-based behavior across multiple learning layers. These independent layers split the learning objectives, avoiding the learning confusion common in many multi-agent systems. We examine a multi-agent predator-prey biomimetic sensing environment that simulates such coordinated and adversarial behaviors across multiple goals. This work could also be applied to theater-wide autonomous vehicle coordination, such as the hierarchical command and control of autonomous drones and ground vehicles.
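The core idea of the abstract, splitting learning objectives across independent layers so that one layer's updates do not confound another's, can be sketched with tabular Q-learning. The sketch below is illustrative only and is not the authors' eSense implementation: the class name, the two-layer split into goal selection and movement, and all parameters are assumptions made for the example.

```python
import random
from collections import defaultdict

class LayeredQAgent:
    """Hypothetical two-layer Q-learner: a high-level layer learns which
    goal to pursue (e.g. 'hunt' vs. 'evade'), while a separate low-level
    layer learns movement conditioned on the active goal. Keeping the two
    Q-tables independent is the layer-separation idea from the abstract."""

    def __init__(self, goals, actions, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.goals, self.actions = goals, actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q_goal = defaultdict(float)   # key: (state, goal)
        self.q_move = defaultdict(float)   # key: (goal, state, action)

    def _greedy(self, q, keyed_choices):
        # epsilon-greedy over (lookup_key, choice) pairs
        if random.random() < self.epsilon:
            return random.choice(keyed_choices)[1]
        return max(keyed_choices, key=lambda kc: q[kc[0]])[1]

    def choose_goal(self, state):
        return self._greedy(self.q_goal, [((state, g), g) for g in self.goals])

    def choose_action(self, goal, state):
        return self._greedy(self.q_move,
                            [((goal, state, a), a) for a in self.actions])

    def update_goal(self, state, goal, reward, next_state):
        # standard Q-learning update, confined to the high-level table
        best = max(self.q_goal[(next_state, g)] for g in self.goals)
        key = (state, goal)
        self.q_goal[key] += self.alpha * (reward + self.gamma * best
                                          - self.q_goal[key])

    def update_move(self, goal, state, action, reward, next_state):
        # low-level update; never touches the goal-selection table
        best = max(self.q_move[(goal, next_state, a)] for a in self.actions)
        key = (goal, state, action)
        self.q_move[key] += self.alpha * (reward + self.gamma * best
                                          - self.q_move[key])
```

Because each layer maintains its own table and receives its own reward signal, noisy low-level movement data cannot corrupt the high-level strategy estimates, which is one plausible reading of how layered learning avoids the "learning confusion" the abstract describes.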

Keywords

Artificial intelligence · Multi-agent systems · Strategy · Hierarchical reasoning · Predator-prey


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Kennesaw State University, Marietta, USA
