Abstract
In this paper we describe ENSEMBLE-ROLLER, a learning-based automated planner that uses a bagging approach to enhance existing techniques for learning planning policies. Previous policy-style planning and learning systems sort state successors according to the action predictions of a relational classifier. However, these learning-based planners can produce poor-quality plans, since it is very difficult to encode in a single classifier all the situations that can occur in a planning domain. We propose using ensembles of relational classifiers to generate more robust policies. As in other applications of machine learning, the idea behind an ensemble of classifiers is to combine members that are accurate in particular scenarios and diverse enough to cover a wide range of situations. In particular, ENSEMBLE-ROLLER learns ensembles of relational decision trees for each planning domain. The control knowledge from the different sets of trees is either aggregated into a single prediction or applied separately in a multiple-queue search algorithm. Experimental results show that both ways of using the new policies produce, on average, plans of better quality.
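The aggregation idea in the abstract can be illustrated with a minimal sketch of bagging followed by vote counting. This is not the ENSEMBLE-ROLLER implementation: the function names, the (feature, action) training format, and the toy lookup classifier that stands in for a relational decision tree are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def bootstrap_sample(examples, rng):
    """Bagging resampling step: draw len(examples) items with replacement."""
    return [rng.choice(examples) for _ in examples]


def train_lookup_classifier(examples):
    """Hypothetical stand-in for a relational decision tree.

    Maps each state feature to the action most often seen with it;
    returns None for features absent from the training sample.
    """
    counts = defaultdict(Counter)
    for feature, action in examples:
        counts[feature][action] += 1
    table = {f: c.most_common(1)[0][0] for f, c in counts.items()}
    return lambda feature: table.get(feature)


def train_ensemble(examples, n_models, seed=0):
    """Train one classifier per bootstrap replicate of the training set."""
    rng = random.Random(seed)
    return [
        train_lookup_classifier(bootstrap_sample(examples, rng))
        for _ in range(n_models)
    ]


def aggregate_votes(ensemble, feature):
    """Count how many ensemble members predict each action."""
    votes = Counter()
    for classifier in ensemble:
        action = classifier(feature)
        if action is not None:  # a member may never have seen this feature
            votes[action] += 1
    return votes


def rank_actions(ensemble, feature, applicable):
    """Sort applicable actions by descending vote count (ties keep input order)."""
    votes = aggregate_votes(ensemble, feature)
    return sorted(applicable, key=lambda a: -votes[a])


if __name__ == "__main__":
    # Toy training data: (state feature, action chosen in a training plan).
    data = [("holding", "stack")] * 5 + [("clear", "pickup")] * 5
    ensemble = train_ensemble(data, n_models=5)
    print(rank_actions(ensemble, "holding", ["pickup", "stack"]))
```

In this sketch the ranking plays the role of the single aggregated prediction used to sort successors. The multiple-queue alternative mentioned in the abstract would instead keep one ordering per ensemble member, which is not shown here.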
Cite this article
de la Rosa, T., Fuentetaja, R. Bagging strategies for learning planning policies. Ann Math Artif Intell 79, 291–305 (2017). https://doi.org/10.1007/s10472-016-9523-9
DOI: https://doi.org/10.1007/s10472-016-9523-9