Reinforcement Learning Algorithm for Partially Observable Environment Considering State Estimation Error Prediction

  • Conference paper
  • In: Proceedings of International Conference on Advanced Computing Applications
  • Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1406)


Abstract

Reinforcement learning is a machine learning technique that learns a policy through trial-and-error interaction with a target environment, guided by a reward criterion. In this paper, we investigate a reinforcement learning algorithm for a partially observable environment without prior knowledge of the state transition and observation likelihood functions. When only noisy or incomplete measurements are available, for example from inexpensive sensors, a state estimator can be combined with a policy trained via reinforcement learning to recover the policy's performance. Even so, performance may degrade because of state estimation error. To address this problem, we propose a planning method that takes the prediction of state estimation error into account. The proposed algorithm indirectly predicts the state estimation error and, using this prediction, avoids entering regions where large estimation error may occur. We tested the proposed method on a simple inverted pendulum task in which the agent receives only noisy and drifted measurements. Empirical results demonstrate that the proposed method improves the performance of the policy in a partially observable environment.
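The abstract describes combining a state estimator with a learned policy and penalizing regions where estimation error is likely to be large. The following is a minimal, self-contained Python sketch of that general idea only: it assumes a bootstrap particle filter as the state estimator and uses the particle variance as a proxy for state estimation error. The linear dynamics, Gaussian noise models, function names, and penalty coefficient are illustrative assumptions, not the paper's actual pendulum model or algorithm.

```python
import numpy as np

def particle_filter_step(particles, action, observation,
                         transition_noise=0.05, obs_noise=0.1, rng=None):
    """One predict/update/resample cycle of a bootstrap particle filter.

    The linear transition x' = x + action and the Gaussian observation
    likelihood are placeholder models chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Predict: propagate every particle through the assumed transition model.
    particles = particles + action + rng.normal(0.0, transition_noise,
                                                particles.shape)
    # Update: weight each particle by its Gaussian observation likelihood.
    weights = np.exp(-0.5 * ((observation - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

def penalized_reward(reward, particles, penalty_coeff=1.0):
    """Subtract a penalty proportional to the particle variance, used here
    as a stand-in for predicted state estimation error, so planning steers
    away from regions where the estimate is unreliable."""
    return reward - penalty_coeff * particles.var()
```

In this sketch, an agent would call `particle_filter_step` after each action to maintain its belief over the hidden state, and train or plan against `penalized_reward` instead of the raw environment reward, discouraging visits to states where the belief is diffuse.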




Author information

Corresponding author

Correspondence to Takeshi Shibuya.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Watabe, Y., Shibuya, T. (2022). Reinforcement Learning Algorithm for Partially Observable Environment Considering State Estimation Error Prediction. In: Mandal, J.K., Buyya, R., De, D. (eds) Proceedings of International Conference on Advanced Computing Applications. Advances in Intelligent Systems and Computing, vol 1406. Springer, Singapore. https://doi.org/10.1007/978-981-16-5207-3_52
