
Learning by Knowledge Sharing in Autonomous Intelligent Systems

  • Ramón García-Martínez
  • Daniel Borrajo
  • Pablo Maceri
  • Paola Britos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4140)

Abstract

Very few learning systems applied to problem solving have focused on learning operator definitions from interaction with a completely unknown environment. To improve learning convergence, several agents that learn separately are allowed to exchange their learned sets of planning operators. Learning proceeds by building plans, executing those plans in the environment, analyzing the results of the execution, and combining the new evidence with prior evidence. Operators are generated incrementally by combining rote learning, induction, and a variant of reinforcement learning. The results show that allowing communication among individual learning (and planning) agents yields a much higher percentage of successful plans and a faster convergence rate than the individual agents achieve alone.
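
The plan-execute-analyze-revise cycle and the operator-exchange step described above can be illustrated compactly. The following is a minimal Python sketch under stated assumptions: the names (Operator, Agent, observe, share_with) are hypothetical, and operators are reduced to condition-action-effect triples with success counters; the authors' actual operator model and sharing protocol are richer than this.

    from dataclasses import dataclass

    @dataclass
    class Operator:
        """A learned planning operator (simplified, hypothetical model).

        p counts how often the operator's prediction was tested,
        k how often the predicted effects were actually observed.
        """
        conditions: frozenset
        action: str
        effects: frozenset
        p: int = 1
        k: int = 1

        @property
        def confidence(self) -> float:
            # Evidence-based estimate of the operator's reliability.
            return self.k / self.p

    class Agent:
        def __init__(self, name: str):
            self.name = name
            # Rote-learning store: one operator per (situation, action) pair.
            self.operators: dict[tuple, Operator] = {}

        def observe(self, before: frozenset, action: str, after: frozenset) -> None:
            """Combine new evidence with prior evidence for one transition."""
            key = (before, action)
            op = self.operators.get(key)
            if op is None:
                # Rote learning: memorize a brand-new operator.
                self.operators[key] = Operator(before, action, after)
            elif op.effects == after:
                op.p += 1
                op.k += 1  # prediction confirmed: reinforce
            else:
                op.p += 1  # prediction failed: confidence drops

        def share_with(self, other: "Agent") -> None:
            """Knowledge sharing: merge another agent's operators into ours."""
            for key, op in other.operators.items():
                mine = self.operators.get(key)
                if mine is None:
                    # Unknown operator: adopt it along with its evidence.
                    self.operators[key] = Operator(op.conditions, op.action,
                                                   op.effects, op.p, op.k)
                elif mine.effects == op.effects:
                    # Same prediction learned independently: pool the evidence.
                    mine.p += op.p
                    mine.k += op.k

A usage example under the same assumptions: two agents that independently observe the same transition, then exchange operators.

    a, b = Agent("a1"), Agent("a2")
    s0 = frozenset({"at-door", "door-closed"})
    s1 = frozenset({"at-door", "door-open"})
    a.observe(s0, "push", s1)
    b.observe(s0, "push", s1)
    a.share_with(b)  # a's operator now carries p=2, k=2

Pooling counters rather than overwriting operators is what lets sharing improve convergence in this sketch: each agent's reliability estimate for an operator rests on the union of all agents' experience rather than on its own alone.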

Keywords

Reinforcement Learning, Markov Decision Process, Planning Operator, Hidden State, Partially Observable Markov Decision Process
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ramón García-Martínez (1, 2, 3)
  • Daniel Borrajo (1, 2, 3)
  • Pablo Maceri (1, 2, 3)
  • Paola Britos (1, 2, 3)

  1. Software and Knowledge Engineering Center, Graduate School, Buenos Aires Institute of Technology, Argentina
  2. Departamento de Informática, Universidad Carlos III de Madrid, Spain
  3. Intelligent Systems Laboratory, School of Engineering, University of Buenos Aires, Argentina
