Abstract
Bayesian Optimization (BO) is an efficient method for optimizing an expensive black-box function over continuous variables. In many cases, however, the function takes only discrete variables as inputs, which traditional BO methods cannot optimize directly. A typical workaround treats the objective function as if it were defined on a continuous domain, applies a standard BO method, and rounds each suggested continuous point to its nearest discrete point. This can cause BO to get stuck, repeatedly suggesting pre-existing observations. To overcome this problem, we propose a method (named Discrete-BO) that manipulates the exploration of the acquisition function and the length scale of the covariance function, two key components of a BO method, to prevent sampling a pre-existing observation. Our experiments on both synthetic and real-world applications show that the proposed method outperforms state-of-the-art baselines in terms of convergence rate. We also provide theoretical analysis to support the correctness of our method.
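The failure mode the abstract describes, and the fix it proposes, can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the toy objective `f`, the GP-LCB acquisition, the exploration-doubling schedule, and the fixed length scale are all illustrative assumptions (Discrete-BO also adjusts the length scale, which is omitted here).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Toy discrete problem (illustrative, not from the paper):
# minimise f over the integers 0..20.
def f(x):
    return (x - 13) ** 2

cont_grid = np.linspace(0.0, 20.0, 401).reshape(-1, 1)

def suggest(gp, beta):
    """Minimise a GP-LCB acquisition on the continuous relaxation,
    then round the suggestion to the nearest integer -- the naive
    strategy the abstract says can get BO stuck on repeats."""
    mu, sigma = gp.predict(cont_grid, return_std=True)
    x_star = cont_grid[np.argmin(mu - beta * sigma), 0]
    return int(round(x_star))

X, y = [[0.0], [20.0], [10.0]], [f(0), f(20), f(10)]

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=3.0),
                                  optimizer=None,  # length scale held fixed
                                  normalize_y=True).fit(X, y)
    beta = 2.0
    x_next = suggest(gp, beta)
    # Sketch of the Discrete-BO idea: when the rounded suggestion
    # repeats an existing observation, boost exploration and retry.
    for _ in range(12):
        if [float(x_next)] not in X:
            break
        beta *= 2.0
        x_next = suggest(gp, beta)
    else:  # still a repeat after boosting: take a random unseen point
        unseen = [v for v in range(21) if [float(v)] not in X]
        x_next = int(rng.choice(unseen))
    X.append([float(x_next)])
    y.append(f(x_next))

best_x = int(X[int(np.argmin(y))][0])
```

Because every accepted suggestion is checked against the history, the sampled points stay distinct; a plain rounding loop, by contrast, can propose the same integer indefinitely once the acquisition optimum falls near an already-observed point.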
Acknowledgements
This research was partially funded by the Australian Government through the Australian Research Council (ARC). Prof Venkatesh is the recipient of an ARC Australian Laureate Fellowship (FL170100006).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Luong, P., Gupta, S., Nguyen, D., Rana, S., Venkatesh, S. (2019). Bayesian Optimization with Discrete Variables. In: Liu, J., Bailey, J. (eds) AI 2019: Advances in Artificial Intelligence. AI 2019. Lecture Notes in Computer Science, vol 11919. Springer, Cham. https://doi.org/10.1007/978-3-030-35288-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35287-5
Online ISBN: 978-3-030-35288-2
eBook Packages: Computer Science; Computer Science (R0)