
Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty

Conference paper in Advances in Service Science (INFORMS-CSS 2018), part of the Springer Proceedings in Business and Economics (SPBE) book series.

Abstract

This paper formulates partially observable Markov decision processes (POMDPs) in which the state-transition and measurement-outcome probabilities depend on unknown parameters. An information-theoretic solution method that adaptively manages the resulting exploitation-exploration trade-off is proposed. Numerical experiments on response-guided dosing in healthcare are presented.
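The paper's own algorithm is not reproduced on this page, but the information-directed sampling principle it builds on (choosing the action that minimizes a ratio of squared expected regret to an information-gain proxy, in the spirit of Russo and Van Roy) can be sketched on a toy two-armed Bernoulli bandit. This is a minimal illustrative sketch, not the authors' method; the function name `ids_action` and the variance-based proxy `v` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def ids_action(alpha, beta, n_samples=4000):
    """Choose an arm by minimizing the variance-based information ratio
    delta(a)^2 / v(a), estimated from Beta posterior samples.

    delta(a): expected regret of arm a under the current posterior.
    v(a): variance of arm a's mean induced by uncertainty about which
          arm is optimal (a common proxy for information gain).
    """
    k = len(alpha)
    theta = rng.beta(alpha, beta, size=(n_samples, k))      # posterior draws
    best = theta.argmax(axis=1)                             # optimal arm per draw
    delta = theta.max(axis=1).mean() - theta.mean(axis=0)   # expected regret, >= 0
    mean_a = theta.mean(axis=0)
    probs = np.bincount(best, minlength=k) / n_samples      # P(arm b is optimal)
    v = np.zeros(k)
    for a in range(k):
        for b in range(k):
            if probs[b] > 0:
                cond_mean = theta[best == b, a].mean()      # E[theta_a | b optimal]
                v[a] += probs[b] * (cond_mean - mean_a[a]) ** 2
    ratio = delta ** 2 / np.maximum(v, 1e-12)
    return int(ratio.argmin())

# Toy run: two Bernoulli arms; sampling should concentrate on the better arm.
true_means = np.array([0.3, 0.7])
alpha, beta = np.ones(2), np.ones(2)
pulls = np.zeros(2, dtype=int)
for t in range(200):
    a = ids_action(alpha, beta)
    reward = rng.random() < true_means[a]
    alpha[a] += reward
    beta[a] += 1 - reward
    pulls[a] += 1
```

An arm with high expected regret is pulled only when pulling it is also informative about which arm is optimal; the ratio thereby manages the exploitation-exploration trade-off adaptively, which is the behavior the paper extends to POMDPs with parametric uncertainty.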



Acknowledgements

This research was funded in part by the National Science Foundation via grant CMMI #1536717.

Author information

Corresponding author: Archis Ghate.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Kumar, P., Ghate, A. (2019). Information Directed Policy Sampling for Partially Observable Markov Decision Processes with Parametric Uncertainty. In: Yang, H., Qiu, R. (eds) Advances in Service Science. INFORMS-CSS 2018. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-030-04726-9_20

