Abstract
Reinforcement learning is a machine learning technique that learns a policy by trial and error in the target environment, guided by a reward criterion. In this paper, we investigate a reinforcement learning algorithm for a partially observable environment without prior knowledge of the state transition and observation likelihood functions. When only noisy or incomplete measurements are available, for example from inexpensive sensors, a state estimator can be combined with a policy trained via reinforcement learning to recover the policy's performance. Even so, performance may degrade due to state estimation error. To address this problem, we propose a planning method that accounts for a prediction of the state estimation error. The proposed algorithm indirectly predicts the state estimation error and, using this prediction, avoids entering regions where large estimation error may occur. We tested the proposed method on a simple inverted pendulum task in which the agent receives only noisy and drifted measurements. Empirical results demonstrate that the proposed method improves the performance of the policy in a partially observable environment.
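As a minimal sketch of the setting the abstract describes, the code below tracks a pendulum's angle from noisy measurements with a simple bootstrap particle filter, the kind of state estimator that could feed a learned policy. The dynamics, noise levels, and particle count here are illustrative assumptions, not the authors' implementation or their error-prediction method.

```python
import numpy as np

def pendulum_step(theta, omega, u, dt=0.05, g=9.8, length=1.0):
    """One Euler step of simple pendulum dynamics (illustrative model)."""
    omega = omega + (-g / length * np.sin(theta) + u) * dt
    theta = theta + omega * dt
    return theta, omega

def particle_filter_estimate(observations, controls, n_particles=500,
                             obs_noise=0.2, proc_noise=0.05, rng=None):
    """Bootstrap particle filter: predict, weight by likelihood, resample."""
    rng = np.random.default_rng(0) if rng is None else rng
    thetas = rng.normal(0.0, 0.5, n_particles)
    omegas = rng.normal(0.0, 0.5, n_particles)
    estimates = []
    for z, u in zip(observations, controls):
        # predict: propagate particles through the dynamics plus process noise
        thetas, omegas = pendulum_step(thetas, omegas, u)
        thetas = thetas + rng.normal(0.0, proc_noise, n_particles)
        # update: weight particles by a Gaussian observation likelihood
        w = np.exp(-0.5 * ((z - thetas) / obs_noise) ** 2)
        w = w / w.sum()
        # point estimate: the weighted mean angle
        estimates.append(float(np.sum(w * thetas)))
        # resample particles in proportion to their weights
        idx = rng.choice(n_particles, n_particles, p=w)
        thetas, omegas = thetas[idx], omegas[idx]
    return estimates

# Demo: simulate a true trajectory and recover it from noisy angle readings.
rng = np.random.default_rng(1)
true_theta, true_omega = 0.3, 0.0
truth, obs, ctrl = [], [], []
for _ in range(50):
    true_theta, true_omega = pendulum_step(true_theta, true_omega, 0.0)
    truth.append(true_theta)
    obs.append(true_theta + rng.normal(0.0, 0.2))  # noisy sensor
    ctrl.append(0.0)
estimates = particle_filter_estimate(obs, ctrl)
est_err = float(np.mean(np.abs(np.array(estimates) - np.array(truth))))
```

A policy would then act on `estimates` instead of the raw observations; the paper's contribution is to additionally predict when `est_err`-like quantities become large and steer the agent away from those regions.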
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Watabe, Y., Shibuya, T. (2022). Reinforcement Learning Algorithm for Partially Observable Environment Considering State Estimation Error Prediction. In: Mandal, J.K., Buyya, R., De, D. (eds) Proceedings of International Conference on Advanced Computing Applications. Advances in Intelligent Systems and Computing, vol 1406. Springer, Singapore. https://doi.org/10.1007/978-981-16-5207-3_52
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-5206-6
Online ISBN: 978-981-16-5207-3
eBook Packages: Intelligent Technologies and Robotics