Abstract
In this paper we provide empirical data on the performance of the two most commonly used multiobjective reinforcement learning algorithms across a set of benchmark problems. First, we describe the methodology used in this paper. Then, we carefully describe the details and properties of the proposed problems and how those properties influence the behavior of the tested algorithms. We also introduce a testing framework that will significantly improve future empirical comparisons of multiobjective reinforcement learning algorithms, and we hope this testing environment eventually becomes a central repository of test problems and algorithms. The empirical results clearly identify features of the test problems which impact the performance of each algorithm, demonstrating the utility of empirically testing algorithms on problems with known characteristics.
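Empirical comparisons of multiobjective reinforcement learning algorithms are commonly scored by the hypervolume of each algorithm's Pareto-front approximation relative to a reference point. As a minimal illustrative sketch only (not the paper's testing framework), the following Python snippet computes the two-objective hypervolume for hypothetical front and reference-point values:

```python
# Illustrative sketch: two-objective hypervolume, a standard metric for
# comparing multiobjective RL algorithms. Point sets and the reference
# point below are hypothetical, not data from the paper.

def hypervolume_2d(points, ref):
    """Area dominated by `points` relative to `ref`.

    Assumes both objectives are maximised and every point dominates
    the reference point `ref`.
    """
    # Sort by the first objective, then keep only nondominated points.
    front = []
    for p in sorted(points, key=lambda p: p[0]):
        # A later point with equal-or-higher second objective
        # dominates earlier points, so drop them.
        while front and front[-1][1] <= p[1]:
            front.pop()
        front.append(p)
    # Sum rectangular slices between consecutive front points:
    # the slice ending at x_i has height y_i, since y decreases with x.
    volume, prev_x = 0.0, ref[0]
    for x, y in front:
        volume += (x - prev_x) * (y - ref[1])
        prev_x = x
    return volume

# Hypothetical Pareto-front approximations from two agents.
front_a = [(1.0, 4.0), (2.0, 3.0), (3.0, 1.0)]
front_b = [(1.5, 3.5), (2.5, 2.0)]
ref = (0.0, 0.0)
print(hypervolume_2d(front_a, ref))  # 8.0  -- larger = better coverage
print(hypervolume_2d(front_b, ref))  # 7.25
```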
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Issabekov, R., Vamplew, P. (2012). An Empirical Comparison of Two Common Multiobjective Reinforcement Learning Algorithms. In: Thielscher, M., Zhang, D. (eds.) AI 2012: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 7691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35101-3_53
DOI: https://doi.org/10.1007/978-3-642-35101-3_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35100-6
Online ISBN: 978-3-642-35101-3