Program Search for Machine Learning Pipelines Leveraging Symbolic Planning and Reinforcement Learning

  • Fangkai Yang
  • Steven Gustafson (email author)
  • Alexander Elkholy
  • Daoming Lyu
  • Bo Liu
Part of the Genetic and Evolutionary Computation book series (GEVO)


In this paper, we investigate an alternative knowledge representation and learning strategy for the automated machine learning (AutoML) task. Our approach combines a symbolic planner with reinforcement learning to evolve programs that process data and train machine learning classifiers. The planner, which generates all feasible plans from the initial state to the goal state, first prefers the shortest programs and later prefers those that maximize reward. The results demonstrate that the approach finds good machine learning pipelines and that the representation can be used to infer new knowledge relevant to the problem instances being solved. These insights on representation and learning strategy can be useful for other automatic programming approaches, such as genetic programming (GP) and Bayesian optimization for pipeline learning.
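The planning-then-learning loop described in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation: the stages, operator names, scores, the optimistic prior, and the two-phase schedule (`warmup`) are all assumptions made up for this example, and a plain running average stands in for the reinforcement learning component. The sketch enumerates every feasible pipeline, tries the shortest ones first, and then prefers untried pipelines whose operators have earned the highest average reward so far.

```python
# Toy sketch only: operators, stages, and scores below are invented for illustration.
PREPROCESSORS = [None, "lowercase"]  # None means the optional stage is skipped
FEATURIZERS = ["counts", "tfidf"]
CLASSIFIERS = ["logreg", "svm"]

# Made-up deterministic scores standing in for measured classifier accuracy.
TOY_SCORE = {"lowercase": 0.05, "counts": 0.60, "tfidf": 0.70, "logreg": 0.10, "svm": 0.15}

OPTIMISTIC = 1.0  # assumed prior value for operators that have not been tried yet


def feasible_plans():
    """Enumerate every operator sequence that leads from raw data to a trained classifier."""
    plans = []
    for pre in PREPROCESSORS:
        for feat in FEATURIZERS:
            for clf in CLASSIFIERS:
                plans.append(tuple(op for op in (pre, feat, clf) if op is not None))
    return plans


def evaluate(plan):
    """Stand-in for running the pipeline and measuring its reward."""
    return sum(TOY_SCORE[op] for op in plan)


def estimated_value(plan, value):
    """Mean learned value of the plan's operators; unseen operators get the optimistic prior."""
    return sum(value.get(op, OPTIMISTIC) for op in plan) / len(plan)


def search(episodes=5, warmup=2):
    """Try shortest plans during warmup, then plans with the highest estimated value."""
    value, visits, history = {}, {}, []
    plans = feasible_plans()
    for episode in range(episodes):
        tried = {plan for plan, _ in history}
        untried = [p for p in plans if p not in tried]
        if not untried:
            break
        if episode < warmup:
            plan = min(untried, key=len)  # shortest program first
        else:
            plan = max(untried, key=lambda p: estimated_value(p, value))
        reward = evaluate(plan)
        history.append((plan, reward))
        # Credit every operator in the plan with the plan's reward (running average).
        for op in plan:
            visits[op] = visits.get(op, 0) + 1
            old = value.get(op, 0.0)
            value[op] = old + (reward - old) / visits[op]
    best_plan, best_reward = max(history, key=lambda item: item[1])
    return best_plan, best_reward, history


best_plan, best_reward, history = search()
print(best_plan)  # → ('lowercase', 'tfidf', 'svm')
```

In this toy setting the first two episodes evaluate two-operator pipelines, and the learned operator values then steer the search toward the three-operator pipeline with the highest score. A real system would replace `evaluate` with actual pipeline training and validation.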



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Fangkai Yang (1)
  • Steven Gustafson (1, email author)
  • Alexander Elkholy (1)
  • Daoming Lyu (2)
  • Bo Liu (2)

  1. Maana, Inc., Bellevue, USA
  2. Auburn University, Auburn, USA
