Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty

  • Peeyush Kumar
  • Archis Ghate
Conference paper
Part of the Springer Proceedings in Business and Economics book series (SPBE)

Abstract

This paper formulates partially observable Markov decision processes in which the state-transition probabilities and measurement-outcome probabilities are characterized by unknown parameters. It proposes an information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off, and presents numerical experiments for response-guided dosing in healthcare.

Notes

Acknowledgements

This research was funded in part by the National Science Foundation via grant CMMI #1536717.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Industrial & Systems Engineering, University of Washington, Seattle, USA
