Flexible Transfer Learning Framework for Bayesian Optimisation

  • Tinu Theckel Joy
  • Santu Rana
  • Sunil Kumar Gupta
  • Svetha Venkatesh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9651)


Bayesian optimisation is an efficient technique for optimising functions that are expensive to evaluate. In this paper, we propose a novel framework that transfers knowledge from a completed source optimisation task to a new target task in order to overcome the cold-start problem. We model source data as noisy observations of the target function, and the level of noise is estimated from the data in a Bayesian setting. This enables flexible knowledge transfer across tasks of differing relatedness, addressing a limitation of existing methods. We evaluate the framework on tuning the hyperparameters of two machine learning algorithms. Treating a fraction of the training data as the source task and the whole as the target task, we show that our method finds the best hyperparameters in the least time compared with both the state-of-the-art and a no-transfer baseline.
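The transfer mechanism described above can be sketched with plain NumPy: source observations enter the Gaussian process model of the target function with an inflated noise variance, so less-related source data is automatically down-weighted. Everything concrete here is illustrative, not the authors' implementation: the 1-D test functions, the kernel length scale, and the fixed noise levels `sigma_src2`/`sigma_tgt2` are assumptions (the paper estimates the source noise level from data in a Bayesian setting rather than fixing it).

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.5):
    # Squared-exponential (RBF) kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise_var):
    # GP regression mean with a per-point noise variance, which is what
    # lets source and target observations carry different noise levels.
    K = rbf_kernel(X_train, X_train) + np.diag(noise_var)
    K_s = rbf_kernel(X_test, X_train)
    return K_s @ np.linalg.solve(K, y_train)

# Hypothetical 1-D target function and a related-but-offset source task.
target = lambda x: np.sin(3 * x)
source = lambda x: np.sin(3 * x) + 0.3

rng = np.random.default_rng(0)
X_src = rng.uniform(0, 2, (15, 1))   # many cheap source evaluations
X_tgt = rng.uniform(0, 2, (3, 1))    # few expensive target evaluations
X = np.vstack([X_src, X_tgt])
y = np.concatenate([source(X_src).ravel(), target(X_tgt).ravel()])

# Source points are treated as *noisy* observations of the target:
# a larger noise variance down-weights them when the tasks are less related.
sigma_src2, sigma_tgt2 = 0.2, 1e-4   # illustrative fixed values
noise = np.concatenate([np.full(len(X_src), sigma_src2),
                        np.full(len(X_tgt), sigma_tgt2)])

X_grid = np.linspace(0, 2, 50)[:, None]
mu = gp_posterior_mean(X, y, X_grid, noise)
```

In a full Bayesian optimisation loop, `mu` (together with the posterior variance) would feed an acquisition function such as expected improvement; the point of the sketch is only the heteroscedastic noise term that realises the transfer.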


Keywords: Target Function · Radial Basis Function Kernel · Transfer Learning · Target Task · Expected Improvement



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Tinu Theckel Joy
  • Santu Rana
  • Sunil Kumar Gupta
  • Svetha Venkatesh

  1. Centre for Pattern Recognition and Data Analytics, Deakin University, Geelong, Australia
