
Learning dynamic algorithm portfolios

  • Matteo Gagliolo
  • Jürgen Schmidhuber

Abstract

Algorithm selection can be performed using a model of runtime distributions, learned during a preliminary training phase. There is a trade-off between the performance of model-based algorithm selection and the cost of learning the model. In this paper, we treat this trade-off in the context of bandit problems. We propose a fully dynamic and online algorithm selection technique with no separate training phase: all candidate algorithms are run in parallel, while a model incrementally learns their runtime distributions. A redundant set of time allocators uses the partially trained model to propose machine-time shares for the algorithms. A bandit problem solver mixes the model-based shares with a uniform share, gradually increasing the impact of the best time allocators as the model improves. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark, and with a set of solvers for the Auction Winner Determination problem.
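
The following is a minimal, self-contained Python sketch of the loop the abstract describes, under loudly simplified assumptions: the runtime "model" here is just an incrementally estimated mean runtime per algorithm (the paper learns full runtime distributions from censored observations), only two time allocators compete (uniform and model-based), solver runtimes are simulated by exponential draws, and the bandit update is a simplified Exp3-style rule. All names, the reward scheme, and the update rule are illustrative assumptions, not the paper's exact formulation.

    import math
    import random

    K = 3          # algorithms in the portfolio
    GAMMA = 0.1    # exploration rate of the bandit over allocators
    random.seed(0)

    # Crude stand-in for the runtime model: incremental mean runtime per algorithm.
    mean_rt = [1.0] * K
    count = [0] * K

    def uniform_shares():
        return [1.0 / K] * K

    def model_shares():
        # Favor algorithms with shorter estimated mean runtime.
        inv = [1.0 / m for m in mean_rt]
        z = sum(inv)
        return [v / z for v in inv]

    allocators = [uniform_shares, model_shares]   # the "experts"
    w = [1.0] * len(allocators)                   # bandit weight per allocator

    for _ in range(200):
        # Hypothetical per-instance runtimes; real solvers would run instead.
        rt = [random.expovariate(lam) for lam in (1.0, 0.7, 0.3)]

        # Exp3-style probabilities: exploit the weights, explore uniformly.
        z = sum(w)
        p = [(1 - GAMMA) * wi / z + GAMMA / len(w) for wi in w]

        # Mix the allocators' proposed shares according to p.
        proposals = [a() for a in allocators]
        share = [sum(p[i] * proposals[i][j] for i in range(len(w)))
                 for j in range(K)]

        # All algorithms run in parallel; with share s_j of machine time,
        # algorithm j solves the instance after rt[j] / s_j wall-clock units.
        finish = [rt[j] / max(share[j], 1e-9) for j in range(K)]
        winner = min(range(K), key=lambda j: finish[j])

        # Only the winner's runtime is used to update the model; the losers'
        # runtimes are censored at the point the run stops.
        count[winner] += 1
        mean_rt[winner] += (rt[winner] - mean_rt[winner]) / count[winner]

        # Reward each allocator by the share it gave the winner, then apply a
        # simplified Exp3-style multiplicative update.
        for i in range(len(w)):
            reward = proposals[i][winner]          # in [0, 1]
            w[i] *= math.exp(GAMMA * reward / (len(w) * p[i]))

    print("final allocator weights (uniform, model-based):", w)

The censoring visible in the update step is the reason survival analysis appears among the keywords: in a parallel portfolio, each solved instance yields one complete runtime and several censored ones.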

Keywords

algorithm selection · algorithm portfolios · online learning · life-long learning · bandit problem · expert advice · survival analysis · satisfiability · constraint programming

Mathematics Subject Classifications (2000)

68T05 · 68T20 · 62N99 · 62G99

Copyright information

© Springer Science+Business Media, Inc. 2007

Authors and Affiliations

  1. IDSIA, Manno (Lugano), Switzerland
  2. Faculty of Informatics, University of Lugano, Lugano, Switzerland
  3. TU Munich, München, Germany
