Algorithm portfolio selection as a bandit problem with unbounded losses

  • Matteo Gagliolo
  • Jürgen Schmidhuber


We propose a method that learns to allocate computation time to a given set of algorithms, of unknown performance, with the aim of solving a given sequence of problem instances in minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. We adopt instead an online approach, named GAMBLETA, in which algorithm performance models are iteratively updated and used to guide allocation on a sequence of problem instances. GAMBLETA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time, as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and allocations evaluated by the different portfolios were mixed using a solver for the bandit problem with expert advice; this required setting an arbitrary bound on algorithm runtimes, invalidating the optimal regret of the solver. In this paper, we propose a simpler version of GAMBLETA, in which the allocators correspond to the arms, such that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information and an unknown bound on losses. We devise a solver for this game, proving a bound on its expected regret. We present experiments based on results from several solver competitions, in various domains, comparing GAMBLETA with another online method.
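To make the core mechanism concrete, the following is a minimal sketch of an Exp3-style solver for a bandit with partial information and an unknown bound on losses. It is an illustration of the general technique, not the paper's exact algorithm or its regret-optimal parameterization: the class name, the fixed exploration rate, and the doubling scheme for the running loss-bound estimate are all assumptions made for this sketch. Each arm stands for one time allocator (portfolio); the observed loss is the runtime spent on an instance.

```python
import math
import random

class AdaptiveExp3:
    """Exp3-style bandit solver with partial information and an
    unknown bound on losses (illustrative sketch only).

    Losses (e.g. runtimes) are rescaled by a running bound estimate
    that doubles whenever an observed loss exceeds it."""

    def __init__(self, n_arms, gamma=0.1):
        self.n = n_arms
        self.gamma = gamma            # exploration rate (assumed fixed here)
        self.bound = 1.0              # current estimate of the maximum loss
        self.weights = [1.0] * n_arms

    def probabilities(self):
        # Mix the exponential weights with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def select(self):
        # Sample one arm (one portfolio) for the next problem instance.
        p = self.probabilities()
        r, acc = random.random(), 0.0
        for i, pi in enumerate(p):
            acc += pi
            if r <= acc:
                return i
        return self.n - 1

    def update(self, arm, loss):
        # Only the loss of the played arm is observed (partial information).
        # Double the bound estimate until it covers the observed loss;
        # earlier updates are not rescaled retroactively (sketch only).
        while loss > self.bound:
            self.bound *= 2.0
        p = self.probabilities()[arm]
        est = (loss / self.bound) / p   # importance-weighted loss estimate
        self.weights[arm] *= math.exp(-self.gamma * est / self.n)
```

In use, one arm is sampled per problem instance, the instance is solved with that portfolio's allocation, and the resulting runtime is fed back as the loss; arms with consistently lower runtimes accumulate higher selection probability over the sequence.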


Keywords

Algorithm selection · Algorithm portfolios · Meta learning · Online learning · Multi-armed bandit problem · Survival analysis · Las Vegas algorithms · Computational complexity · Combinatorial optimization · Constraint programming · Satisfiability

Mathematics Subject Classifications (2010)

68T05 · 68T20 · 68W27 · 68Q25 · 62N99 · 62G99





Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. CoMo, Vrije Universiteit Brussel, Brussels, Belgium
  2. IDSIA, Manno (Lugano), Switzerland
  3. Faculty of Informatics, University of Lugano, Lugano, Switzerland
