
Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty

Conference paper in Advances in Service Science (INFORMS-CSS 2018), part of the Springer Proceedings in Business and Economics (SPBE) book series.

Abstract

This paper formulates partially observable Markov decision processes (POMDPs) in which the state-transition and measurement-outcome probabilities depend on unknown parameters. An information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off is proposed. Numerical experiments on response-guided dosing in healthcare are presented.
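The paper's own algorithm is not reproduced on this page, but the information-directed sampling principle it builds on (choosing the action that minimizes a ratio of squared expected regret to an information-gain proxy, in the spirit of Russo and Van Roy) can be sketched on a toy two-armed Bernoulli bandit. This is a minimal illustrative sketch, not the authors' method; the function name `ids_action` and the variance-based proxy `v` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def ids_action(alpha, beta, n_samples=4000):
    """Choose an arm by minimizing the variance-based information ratio
    delta(a)^2 / v(a), estimated from Beta posterior samples.

    delta(a): expected regret of arm a under the current posterior.
    v(a): variance of arm a's mean induced by uncertainty about which
          arm is optimal (a common proxy for information gain).
    """
    k = len(alpha)
    theta = rng.beta(alpha, beta, size=(n_samples, k))      # posterior draws
    best = theta.argmax(axis=1)                             # optimal arm per draw
    delta = theta.max(axis=1).mean() - theta.mean(axis=0)   # expected regret, >= 0
    mean_a = theta.mean(axis=0)
    probs = np.bincount(best, minlength=k) / n_samples      # P(arm b is optimal)
    v = np.zeros(k)
    for a in range(k):
        for b in range(k):
            if probs[b] > 0:
                cond_mean = theta[best == b, a].mean()      # E[theta_a | b optimal]
                v[a] += probs[b] * (cond_mean - mean_a[a]) ** 2
    ratio = delta ** 2 / np.maximum(v, 1e-12)
    return int(ratio.argmin())

# Toy run: two Bernoulli arms; sampling should concentrate on the better arm.
true_means = np.array([0.3, 0.7])
alpha, beta = np.ones(2), np.ones(2)
pulls = np.zeros(2, dtype=int)
for t in range(200):
    a = ids_action(alpha, beta)
    reward = rng.random() < true_means[a]
    alpha[a] += reward
    beta[a] += 1 - reward
    pulls[a] += 1
```

An arm with high expected regret is pulled only when pulling it is also informative about which arm is optimal; the ratio thereby manages the exploitation-exploration trade-off adaptively, which is the behavior the paper extends to POMDPs with parametric uncertainty.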



Acknowledgements

This research was funded in part by the National Science Foundation via grant CMMI #1536717.

Author information

Corresponding author: Archis Ghate.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Kumar, P., Ghate, A. (2019). Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty. In: Yang, H., Qiu, R. (eds) Advances in Service Science. INFORMS-CSS 2018. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-04726-9_20

