Learning and Reusing Goal-Specific Policies for Goal-Driven Autonomy

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 7466)

Abstract

In certain adversarial environments, reinforcement learning (RL) techniques require a prohibitively large number of episodes to learn a high-performing strategy for action selection. For example, Q-learning is particularly slow to learn a policy to win complex strategy games. We propose GRL, the first goal-driven autonomy (GDA) system capable of learning and reusing goal-specific policies. GRL is a case-based GDA agent embedded in the RL cycle. It acquires and reuses cases that capture episodic knowledge about an agent’s (1) expectations, (2) goals to pursue when these expectations are not met, and (3) actions for achieving these goals in given states. Our hypothesis is that, unlike RL, GRL can rapidly fine-tune strategies by exploiting the episodic knowledge captured in its cases. We report performance gains versus a state-of-the-art GDA agent and an RL agent for challenging tasks in two real-time video game domains.
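
As a rough illustration of what a "goal-specific policy" could look like, the sketch below keeps one tabular Q-function per goal and applies the standard Q-learning update within the table of the goal currently being pursued. This is not the GRL algorithm: the abstract does not specify its case structure or update rule, and the class name, goal and state labels, and the parameters alpha and gamma are all hypothetical choices made for this example.

```python
from collections import defaultdict


class GoalSpecificQ:
    """One tabular Q-function per goal (illustrative only; not the paper's GRL)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # q[goal][(state, action)] -> estimated return; unseen entries default to 0.0
        self.q = defaultdict(lambda: defaultdict(float))

    def update(self, goal, state, action, reward, next_state):
        """Standard Q-learning update, applied only to the table for `goal`."""
        table = self.q[goal]
        best_next = max(table[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        table[(state, action)] += self.alpha * (target - table[(state, action)])

    def greedy_action(self, goal, state):
        """Pick the highest-valued action for `state` under the policy for `goal`."""
        table = self.q[goal]
        return max(self.actions, key=lambda a: table[(state, a)])


# Hypothetical usage: experience gathered under one goal leaves other goals untouched.
agent = GoalSpecificQ(actions=["attack", "defend"])
agent.update("capture_point", "s0", "attack", reward=1.0, next_state="s1")
```

Keeping a separate table per goal means that what is learned while pursuing one goal can be reused whenever that goal is selected again, without being overwritten by experience gathered under a different goal.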

Keywords

  • Case-based learning
  • reinforcement learning
  • goal-driven autonomy

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jaidee, U., Muñoz-Avila, H., Aha, D.W. (2012). Learning and Reusing Goal-Specific Policies for Goal-Driven Autonomy. In: Agudo, B.D., Watson, I. (eds) Case-Based Reasoning Research and Development. ICCBR 2012. Lecture Notes in Computer Science, vol 7466. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32986-9_15

  • DOI: https://doi.org/10.1007/978-3-642-32986-9_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32985-2

  • Online ISBN: 978-3-642-32986-9

  • eBook Packages: Computer Science, Computer Science (R0)