Optimization of heuristic search using recursive algorithm selection and reinforcement learning

  • Vasileios Vasilikos
  • Michail G. Lagoudakis


The traditional approach to computational problem solving is to use one of the available algorithms to obtain solutions for all given instances of a problem. However, instances typically differ, and no single algorithm performs best on all of them. Our work investigates a more sophisticated approach to problem solving, called Recursive Algorithm Selection, whereby several algorithms for a problem (including some recursive ones) are available to an agent, which makes an informed decision about which algorithm to select for each sub-instance at each recursive call made while solving an instance. Reinforcement learning methods are used to learn decision policies that optimize any given performance criterion (time, memory, or a combination thereof) from actual execution and profiling experience. This paper focuses on the well-known problem of state-space heuristic search and combines the A* and RBFS algorithms to yield a hybrid search algorithm, whose decision policy is learned using the Least-Squares Policy Iteration (LSPI) algorithm. Our benchmark problem domain involves shortest-path problems on a real-world dataset encoding the entire street network of the District of Columbia (DC), USA. The derived hybrid algorithm performs better than the individual algorithms in the majority of cases, according to a variety of performance criteria balancing time and memory. The proposed methodology is generic, can be applied to a variety of other problems, and requires no prior knowledge about the individual algorithms used or the properties of the underlying problem instances being solved.
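As a rough illustration of the per-call selection idea described above, the following Python sketch lets a policy choose, at every recursive call, between continuing with RBFS and handing the current sub-instance to a bounded A*. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `bounded_astar`, `hybrid_search`, and `threshold_policy`, the toy graph, the heuristic values, and the policy's inputs are all hypothetical, and the hand-written threshold rule merely stands in for a policy that the paper learns with LSPI.

```python
import heapq
import math


def bounded_astar(graph, h, start, goal, g0, f_limit):
    """A* on the sub-instance rooted at `start`, already reached at cost g0.
    Nodes with f > f_limit are pruned. Returns (path, cost) on success,
    or (None, smallest pruned f) on failure."""
    frontier = [(g0 + h[start], g0, start, [start])]
    best_g = {start: g0}
    next_f = math.inf
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if g > best_g.get(node, math.inf):
            continue  # stale queue entry
        if node == goal:
            return path, g
        for nbr, w in graph[node]:
            g2 = g + w
            f2 = g2 + h[nbr]
            if f2 > f_limit:
                next_f = min(next_f, f2)  # remember the cheapest pruned node
                continue
            if g2 < best_g.get(nbr, math.inf):
                best_g[nbr] = g2
                heapq.heappush(frontier, (f2, g2, nbr, path + [nbr]))
    return None, next_f


def hybrid_search(graph, h, goal, policy, node, g, f_node, f_limit, path):
    """One recursive call. Returns (path, cost) or (None, backed-up f)."""
    if node == goal:
        return path, g
    # The policy decides how this sub-instance is handled.
    if policy(node, g, f_node, f_limit) == "astar":
        sub_path, value = bounded_astar(graph, h, node, goal, g, f_limit)
        if sub_path is None:
            return None, value
        return path + sub_path[1:], value
    # Otherwise perform one RBFS expansion step.
    succ = []
    for nbr, w in graph[node]:
        if nbr in path:
            continue  # avoid cycles along the current path
        g2 = g + w
        succ.append([max(g2 + h[nbr], f_node), g2, nbr])
    if not succ:
        return None, math.inf
    while True:
        succ.sort()
        best = succ[0]
        if best[0] > f_limit or best[0] == math.inf:
            return None, best[0]
        alternative = succ[1][0] if len(succ) > 1 else math.inf
        result, value = hybrid_search(
            graph, h, goal, policy, best[2], best[1], best[0],
            min(f_limit, alternative), path + [best[2]])
        if result is not None:
            return result, value
        best[0] = value  # back up the revised f-value and try again


def threshold_policy(h, cutoff):
    """Hand-written stand-in for a learned policy (an assumption):
    use A* when the heuristic says the goal is near, RBFS otherwise."""
    def policy(node, g, f_node, f_limit):
        return "astar" if h[node] <= cutoff else "rbfs"
    return policy


# Toy directed graph: node -> list of (neighbor, edge cost).
graph = {
    "S": [("A", 1), ("B", 4)],
    "A": [("C", 2), ("B", 2)],
    "B": [("G", 3)],
    "C": [("G", 5)],
    "G": [],
}
h = {"S": 5, "A": 4, "B": 3, "C": 4, "G": 0}  # admissible estimates

policy = threshold_policy(h, cutoff=3)
path, cost = hybrid_search(graph, h, "G", policy, "S", 0, h["S"], math.inf, ["S"])
print(path, cost)  # ['S', 'A', 'B', 'G'] 6
```

In this sketch the choice is made from the node's heuristic value alone. A learned policy would instead map profiled features of the sub-instance to the choice that minimizes the selected time and memory criterion.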


Keywords

Heuristic search; Algorithm selection; Reinforcement learning; Software optimization; Hybrid algorithms

Mathematics Subject Classifications (2010)

68T20 68T05 68T37 





Copyright information

© Springer Science+Business Media B.V. 2010

Authors and Affiliations

  1. Intelligent Systems Laboratory, Department of Electronic and Computer Engineering, Technical University of Crete, Crete, Greece
  2. Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
