Active Learning of Relational Action Models
We consider an agent that learns a relational action model in order to predict the effects of its actions. The model consists of a set of STRIPS-like rules, i.e., rules predicting what changes in the current state when a given action is applied, provided that a set of preconditions is satisfied by the current state. Several rules can be associated with a given action, which allows conditional effects to be modeled. Learning is online, as examples result from actions performed by the agent, and incremental, as the current action model is revised each time it is contradicted by unexpected effects of the agent's actions. The form of the model allows it to be used as input to standard planners.
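The STRIPS-like rules described above can be sketched as follows. This is a minimal, ground (variable-free) Python illustration; the names and data structures are assumptions for exposition, not the paper's actual IRALe representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """A STRIPS-like rule: if the preconditions hold, the action's effect
    is to delete some literals from the state and add others."""
    action: str                # e.g. "move(a,b)"
    preconditions: frozenset   # literals that must hold in the current state
    add_effects: frozenset     # literals added by the action
    del_effects: frozenset     # literals removed by the action

def predict(rules, state, action):
    """Predicted next state, or None if no rule covers this action here.

    A rule fires when its action matches and its preconditions are a
    subset of the current state (ground case, for simplicity)."""
    for r in rules:
        if r.action == action and r.preconditions <= state:
            return (state - r.del_effects) | r.add_effects
    return None  # no applicable rule: the model makes no prediction

# Tiny blocks-world example: moving block a from the table onto block b.
rules = [
    Rule("move(a,b)",
         frozenset({"clear(a)", "clear(b)", "on(a,table)"}),
         frozenset({"on(a,b)"}),
         frozenset({"clear(b)", "on(a,table)"})),
]
state = frozenset({"clear(a)", "clear(b)", "on(a,table)", "on(b,table)"})
next_state = predict(rules, state, "move(a,b)")
```

Because several `Rule` objects may share the same `action` with different preconditions, the same action can have different (conditional) effects depending on the state.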
In this work, the learning unit IRALe is embedded in an integrated system able to i) learn an action model, ii) select its actions, and iii) plan to reach a goal. The agent uses the current action model to perform active learning, i.e., to select actions with the purpose of reaching states that will force a revision of the model, and uses its planning abilities to obtain a realistic evaluation of the model's accuracy.
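The exploration loop described above can be sketched as a single step: prefer an action whose effect the current model cannot predict, execute it, and revise the model whenever the prediction is contradicted. Here `predict`, `execute`, and `revise` are hypothetical stand-ins for the agent's model, environment, and revision unit; this is a sketch of the idea, not the actual IRALe algorithm.

```python
import random

def exploration_step(rules, state, candidate_actions, predict, execute, revise):
    """One active-learning step: favor actions not covered by the model,
    then revise the model on any unexpected observed effect."""
    # Actions for which the model makes no prediction are the most informative.
    uncovered = [a for a in candidate_actions if predict(rules, state, a) is None]
    action = random.choice(uncovered if uncovered else list(candidate_actions))
    expected = predict(rules, state, action)
    observed = execute(state, action)  # ground truth from the environment
    if expected is not None and expected != observed:
        # Counterexample: the model predicted wrongly, so revise it.
        rules = revise(rules, state, action, observed)
    return rules, observed

# Toy demo: a one-bit world where the action "flip" toggles a light.
def predict(rules, state, action):
    return rules.get((state, action))            # model = transition lookup table
def execute(state, action):
    return (not state) if action == "flip" else state
def revise(rules, state, action, observed):
    return {**rules, (state, action): observed}  # memorize the counterexample

rules = {(False, "flip"): False}                 # deliberately wrong initial model
rules, new_state = exploration_step(rules, False, ["flip"],
                                    predict, execute, revise)
```

After this step the model's wrong prediction for `flip` has been corrected by the observed effect, illustrating how contradictions drive incremental revision.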
Keywords: Action Model · Incremental Learning · Inductive Logic Programming · Action Rule · Exploration Mode