Algorithm Survival Analysis

  • Matteo Gagliolo
  • Catherine Legrand
Chapter

Abstract

Algorithm selection is typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which models of the runtime distributions of the available algorithms are iteratively updated and used to guide the allocation of computational resources while solving a sequence of problem instances. The models are estimated using survival analysis techniques, which allow us to reduce computation time by censoring the runtimes of the slower algorithms. Here, we review the statistical aspects of our online selection method, discussing the bias induced in the runtime distribution (RTD) models by the competition of different algorithms on the same problem instances.
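To illustrate how censored runtimes can still inform a runtime distribution model, the following is a minimal sketch of the product-limit (Kaplan-Meier) estimator, a standard survival analysis technique. It is an assumption chosen for illustration: the abstract does not state which estimator the chapter uses, and the function name `kaplan_meier` and the sample data are hypothetical.

```python
from collections import Counter


def kaplan_meier(runtimes, solved):
    """Product-limit estimate of the survival function S(t).

    runtimes[i] is the time observed for run i; solved[i] is True if the
    run finished at that time, False if it was stopped early (censored).
    Returns a list of (t, S(t)) pairs, one per distinct finishing time.
    The estimated runtime distribution is F(t) = 1 - S(t).
    """
    finished = Counter(t for t, s in zip(runtimes, solved) if s)
    leaving = Counter(runtimes)  # runs leaving the risk set at each time
    at_risk = len(runtimes)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = finished.get(t, 0)
        if d:
            # Censored runs tied at t are still counted as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical data: five runs, two of them stopped before finishing.
runtimes = [2.0, 3.0, 3.0, 5.0, 8.0]
solved = [True, False, True, True, False]
curve = kaplan_meier(runtimes, solved)
for t, s in curve:
    print(f"t={t:4.1f}  S(t)={s:.3f}  F(t)={1 - s:.3f}")
```

A censored run contributes the information that its runtime exceeds the stopping time, so it stays in the risk set until then without counting as a completion. Discarding censored runs, or treating them as completed, would instead skew the estimate towards short runtimes.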

Keywords

Problem Instance; Algorithm Selection; Bandit Problem; Censor Survival Data; Algorithm Portfolio
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

The authors would like to thank an anonymous reviewer, and Faustino Gomez, for useful comments on a draft of this chapter. The first author was supported by the Swiss National Science Foundation with a grant for prospective researchers (n. PBTI2–118573).

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  1. IRIDIA, CoDE, Université Libre de Bruxelles, Brussels, Belgium
  2. Faculty of Informatics, University of Lugano, Lugano, Switzerland
  3. Institut de Statistique, Université Catholique de Louvain, Louvain-la-Neuve, Belgium