Encyclopedia of Complexity and Systems Science

Living Edition
| Editors: Robert A. Meyers

Learning and Planning (Intelligent Systems)

  • Ugur Kuter
Living reference work entry
DOI: https://doi.org/10.1007/978-3-642-27737-5_308-2


The ability to acquire knowledge from past experience and to exploit that knowledge in later problem-solving and planning sessions is an important attribute of any intelligent system, whether human or artificial. Automated planning and learning is a research paradigm focused on developing intelligent systems and technologies that combine the ability to make decisions and generate courses of action (i.e., plans) with the capability to reason about, and produce knowledge from, past experiences, future problems the system must tackle, and strategies for tackling them.

Probably the first work to give this combination a formal treatment is the early planning system STRIPS (Fikes and Nilsson 1971), developed in the early 1970s. STRIPS provided evidence that planning and learning are usually two complementary pieces of an intelligent system, in which the knowledge acquired via learning is used to enhance the problem-solving and...
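In the STRIPS formalism, a world state is a set of logical facts, and each operator carries a precondition list, an add list, and a delete list; applying an operator is then simple set arithmetic. The following is a minimal illustrative sketch of that representation (not the original 1971 implementation); the blocks-world operator and literal names are hypothetical examples chosen for this entry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    """A STRIPS-style operator: preconditions, add list, delete list."""
    name: str
    preconditions: frozenset
    add_list: frozenset
    delete_list: frozenset

    def applicable(self, state):
        # An operator applies when all of its preconditions hold in the state.
        return self.preconditions <= state

    def apply(self, state):
        # Successor state: remove deleted facts, then add new ones.
        assert self.applicable(state), f"{self.name} is not applicable"
        return (state - self.delete_list) | self.add_list

# Hypothetical blocks-world operator: stack block A onto block B.
stack_a_on_b = Operator(
    name="stack(A,B)",
    preconditions=frozenset({"holding(A)", "clear(B)"}),
    add_list=frozenset({"on(A,B)", "clear(A)", "handempty"}),
    delete_list=frozenset({"holding(A)", "clear(B)"}),
)

state = frozenset({"holding(A)", "clear(B)", "ontable(B)"})
new_state = stack_a_on_b.apply(state)
# new_state: {"on(A,B)", "clear(A)", "handempty", "ontable(B)"}
```

Learning approaches surveyed in this entry, such as macro-operator and precondition learning, operate on exactly this kind of operator description, e.g., by generalizing the precondition sets observed across solved problems.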


Keywords: Planning Problem · Planning Domain · Control Rule · Target Concept · Domain Theory



Acknowledgments

This work was partly supported by DARPA's "Mission-oriented Resilient Cloud (MRC)" program (Contract #: FA865011C7191) and the Office of Naval Research (ONR)'s "Computational Methods for Decision Making" program (Contract #: N0001412C0239). An earlier version of this work was partly supported by DARPA's Transfer Learning and Integrated Learning programs. The opinions in this entry are those of the author and do not necessarily reflect the opinions of the funders.


References

  1. Aha DW (2002) Plan deconfliction, repair, and authoring in EDSS. Progress report, Naval Research Laboratory
  2. Ai-Chang M, Bresina J, Charest L, Hsu J, Jónsson AK, Kanefsky B, Maldague P, Morris P, Rajan K, Yglesias J (2004) MAPGEN: mixed-initiative planning and scheduling for the Mars Exploration Rover mission. IEEE Intell Syst 19(1):8–12
  3. Başar T, Olsder GJ (1995) Dynamic noncooperative game theory. Academic, London/New York
  4. Bergmann R, Wilke W (1996) On the role of abstraction in case-based reasoning. In: EWCBR, pp 28–43
  5. Borrajo D, Veloso M (1997) Lazy incremental learning of control knowledge for efficiently obtaining quality plans. Artif Intell Rev 11(1–5):371–405
  6. Botea A, Müller M, Schaeffer J (2005) Learning partial-order macros from solutions. In: ICAPS, Monterey, pp 231–240
  7. Burstein M, Brinn M, Cox M, Hussain T, Laddaga R, McDermott D, McDonald D, Tomlinson R (2007) An architecture and language for the integrated learning of demonstrations. In: AAAI workshop on acquiring planning knowledge via demonstration, Vancouver, pp 6–11
  8. Candadai A, Herrmann JW, Minis I (1995) A group technology-based variant approach for agile manufacturing. In: Proceedings of the ASME international mechanical engineering congress and exposition, San Francisco
  9. Candadai A, Herrmann JW, Minis I (1996) Applications of group technology in distributed manufacturing. J Intell Manuf 7:271–291
  10. Chien SA (1989) Using and refining simplifications: explanation-based learning of plans in intractable domains. In: IJCAI, Detroit
  11. Choi D, Langley P (2005) Learning teleoreactive logic programs from problem solving. In: International conference on inductive logic programming, pp 51–68
  12. Collins D (2005) A synthesis process model of creative thinking in music composition. Psychol Music 33(2):193–216
  13. DeJong GF, Mooney R (1986) Explanation-based learning: an alternative view. Mach Learn 1(2):145–176
  14. Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. J Artif Intell Res 13:227–303
  15. DOT (1999) An assessment of the U.S. marine transportation system: a report to Congress. Technical report, U.S. Department of Transportation
  16. Edelkamp S, Hoffmann J (2004) International planning competition. http://ipc.icaps-conference.org
  17. Erol K, Nau DS, Subrahmanian VS (1995) Complexity, decidability and undecidability results for domain-independent planning. Artif Intell 76(1–2):75–88
  18. Erol K, Hendler J, Nau DS (1996) Complexity results for hierarchical task-network planning. Ann Math Artif Intell 18:69–93
  19. Estlin TA, Mooney RJ (1997) Learning to improve both efficiency and quality of planning. In: IJCAI, Nagoya, pp 1227–1232
  20. Fern A, Yoon S, Givan R (2004) Learning domain-specific control knowledge from random walks. In: ICAPS, Whistler
  21. Fikes RE, Nilsson NJ (1971) STRIPS: a new approach to the application of theorem proving to problem solving. Artif Intell 2:189–208
  22. Franklin GF, Powell JD, Emami-Naeini A (2002) Feedback control of dynamic systems. Prentice Hall, New Jersey
  23. Gao S (2012) Computable analysis, decision procedures, and hybrid automata: a new framework for the formal verification of cyber-physical systems. PhD thesis, Carnegie Mellon University
  24. Gerevini A, Dimopoulos Y, Haslum P, Saetti A (2006) International planning competition. http://zeus.ing.unibs.it/ipc-5/
  25. Ghallab M, Nau D, Traverso P (2004) Automated planning: theory and practice. Morgan Kaufmann, San Francisco
  26. Giraud-Carrier C (2000) A note on the utility of incremental learning. AI Commun 13:215–223
  27. Goldman R (2004) Adapting research planners for applications. In: ICAPS 2004 workshop on connecting planning theory with practice, Whistler
  28. Gratch J, DeJong G (1992) COMPOSER: a probabilistic solution to the utility problem in speed-up learning. In: AAAI, San Jose
  29. Hebbar K, Smith SJJ, Minis I, Nau DS (1996) Plan-based evaluation of designs for microwave modules. In: Proceedings of the ASME design technical conference, Irvine
  30. Henderson M, Musti S (1988) Automated group technology part coding from a three-dimensional CAD database. J Eng Ind 110(3):278–287
  31. Huang Y, Kautz H, Selman B (2000) Learning declarative control rules for constraint-based planning. In: ICML, Stanford
  32. Ilghami O, Muñoz-Avila H, Nau DS, Aha DW (2005a) Learning approximate preconditions for methods in hierarchical plans. In: ICML, Bonn
  33. Ilghami O, Nau DS, Muñoz-Avila H, Aha DW (2005b) Learning preconditions for planning from plan traces and HTN structure. Comput Intell 21(4):388–413
  34. Kambhampati S (2000) Planning graph as (dynamic) CSP: exploiting EBL, DDB and other CSP techniques in Graphplan. J Artif Intell Res 12:1–34
  35. Kambhampati S, Katukam S, Qu Y (1996) Failure driven dynamic search control for partial order planners: an explanation based approach. Artif Intell 88(1–2):253–315
  36. Knoblock CA (1993) Generating abstraction hierarchies: an automated approach to reducing search in planning. Kluwer, Boston
  37. Laird J, Newell A, Rosenbloom P (1987) SOAR: an architecture for general intelligence. Artif Intell 33(1):1–67
  38. Lanchas J, Jiménez S, Fernández F, Borrajo D (2007) Learning action durations from executions. In: Proceedings of the ICAPS-07 workshop on AI planning and learning, Providence
  39. Langley P, Choi D (2006) Learning recursive control programs from problem solving. J Mach Learn Res 7:493–518
  40. Levine G, DeJong GF (2006) Explanation-based acquisition of planning operators. In: ICAPS, Ambleside
  41. Levine G, Kuter U, Rebguns A, Green DT, Spears DF (2012) Learning and verifying safety constraints for planners in a knowledge-impoverished system. Comput Intell 28(3):329–357
  42. López de Mántaras R, Arcos JL (2002) AI and music: from composition to expressive performance. AI Mag 23(3):43–57
  43. Schmill MD, Oates T, Cohen PR (2000) Learning planning operators in real-world, partially observable environments. In: AIPS, Breckenridge
  44. Minton S (1988) Learning effective search control knowledge: an explanation-based approach. Technical report CMU-CS-88-133, School of Computer Science, Carnegie Mellon University
  45. Mitchell TM (1977) Version spaces: a candidate elimination approach to rule learning. In: IJCAI, Cambridge, pp 305–310
  46. Mitchell TM (1997) Machine learning. McGraw-Hill, New York
  47. Mitchell T, Keller R, Kedar-Cabelli S (1986) Explanation-based generalization: a unifying view. Mach Learn 1(1):47–80
  48. Mooney RJ (1988) Generalizing the order of operators in macro-operators. In: Proceedings of the international conference on machine learning, pp 270–283
  49. Muñoz-Avila H, Breslow LA, Aha DW, Nau DS (1998) Description and functionality of NEODocTA. Technical report AIC-96-005, Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory
  50. Muñoz-Avila H, Aha DW, Breslow L, Nau DS (1999) HICAP: an interactive case-based planning architecture and its application to noncombatant evacuation operations. In: AAAI/IAAI, Orlando, pp 870–875
  51. Nau DS, Cao Y, Lotem A, Muñoz-Avila H (1999) SHOP: simple hierarchical ordered planner. In: Dean T (ed) IJCAI, Stockholm, 31 July–6 Aug 1999. Morgan Kaufmann, pp 968–973
  52. Nejati N, Langley P, Könik T (2006) Learning hierarchical task networks by observation. In: Proceedings of the 23rd international conference on machine learning, Pittsburgh
  53. Parr R (1998) Hierarchical control and learning for Markov decision processes. PhD thesis, University of California, Berkeley
  54. Reddy C, Tadepalli P (1997) Learning goal-decomposition rules using exercises. In: ICML, Nashville
  55. Reddy C, Tadepalli P (1999) Learning Horn definitions: theory and application to planning. New Gener Comput 17(1):77–98
  56. Ruby D, Kibler DF (1991) SteppingStone: an empirical and analytic evaluation. In: AAAI, Anaheim. Morgan Kaufmann, pp 527–531
  57. Russell S, Norvig P (2003) Artificial intelligence: a modern approach, 2nd edn. Prentice Hall, Upper Saddle River
  58. Sacerdoti E (1975) The nonlinear nature of plans. In: IJCAI, pp 206–214. Reprinted in [3], pp 162–170
  59. Shah JJ, Bhatnagar A (1989) Group technology classification from feature-based geometric models. Manuf Rev 2(3):204–213
  60. Smith SJJ, Nau DS, Throop T (1998) Computer bridge: a big win for AI planning. AI Mag 19(2):93–105
  61. Tate A (1977) Generating project networks. In: IJCAI, Cambridge, pp 888–893
  62. Wang X (1994a) Learning by observation and practice: a framework for automatic acquisition of planning operators. In: AAAI
  63. Wang X (1994b) Learning planning operators by observation and practice. In: AIPS
  64. Watkins CJCH, Dayan P (1992) Q-learning. Mach Learn 8(3–4):279–292
  65. Xu K, Muñoz-Avila H (2005) A domain-independent system for case-based task decomposition without domain theories. In: AAAI, Pittsburgh
  66. Yang Q, Wu K, Jiang Y (2005) Learning action models from plan examples with incomplete knowledge. In: ICAPS, Monterey
  67. Yang Q, Wu K, Jiang Y (2007) Learning action models from plan examples using weighted MAX-SAT. Artif Intell 171(2–3):107–143

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Smart Information Flow Technologies (SIFT), Minneapolis, USA