Annals of Operations Research, Volume 190, Issue 1, pp 289–309

Embedding a state space model into a Markov decision process

  • Lars Relund Nielsen
  • Erik Jørgensen
  • Søren Højsgaard


In agriculture, Markov decision processes (MDPs) with finite state and action space are often used to model sequential decision making over time. For instance, the states of the process may represent possible levels of an animal's traits, while the transition probabilities are based on biological models estimated from data collected from the animal or herd.

State space models (SSMs) are a general tool for modeling repeated measurements over time where the model parameters can evolve dynamically.

In this paper we consider methods for embedding an SSM into an MDP with finite state and action space. Different ways of discretizing an SSM are discussed and methods for reducing the state space of the MDP are presented. An example from dairy production is given.
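
As a rough illustration of what discretizing an SSM can look like, the sketch below turns a scalar latent process into a finite Markov chain. It is not the method of the paper: the latent model (theta_t = g * theta_{t-1} + noise with variance w), the choice of interval midpoints as representative points, and the names `normal_cdf` and `discretize_transitions` are all assumptions made for this example.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution N(mean, sd^2); also valid for x = +/-inf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def discretize_transitions(breaks, g, w):
    """Approximate theta_t = g * theta_{t-1} + N(0, w) by a finite Markov chain.

    breaks: increasing interior cut points. They define len(breaks) + 1
            intervals; the first and last intervals are unbounded.
    Returns (centers, P), where centers[i] is the point representing
    interval i and P[i][j] is the probability of moving from interval i
    to interval j, evaluated at centers[i] (an assumed simplification).
    """
    if len(breaks) < 2:
        raise ValueError("need at least two cut points")
    sd = math.sqrt(w)
    lower = [-math.inf] + list(breaks)
    upper = list(breaks) + [math.inf]

    # Bounded intervals are represented by their midpoints; the two
    # unbounded intervals by a point half a neighbouring width outside.
    centers = [breaks[0] - 0.5 * (breaks[1] - breaks[0])]
    centers += [0.5 * (a + b) for a, b in zip(breaks[:-1], breaks[1:])]
    centers.append(breaks[-1] + 0.5 * (breaks[-1] - breaks[-2]))

    P = []
    for c in centers:
        mean = g * c  # conditional mean of the next latent value
        P.append([normal_cdf(u, mean, sd) - normal_cdf(l, mean, sd)
                  for l, u in zip(lower, upper)])
    return centers, P


# Illustrative values only; they are not taken from the paper.
centers, P = discretize_transitions([-1.0, 0.0, 1.0], g=1.0, w=0.25)
for c, row in zip(centers, P):
    print(c, [round(p, 3) for p in row])
```

Each row of `P` sums to one because the intervals cover the whole real line. Finer cut points give a closer approximation at the cost of a larger MDP state space, which is the trade-off the abstract refers to when it mentions reducing the state space.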


Keywords: State space model; Markov decision process; Sequential decision making; Stochastic dynamic programming





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Lars Relund Nielsen (1)
  • Erik Jørgensen (1)
  • Søren Højsgaard (1)

  1. Research Unit of Bioinformatics, Genetics and Statistics, Department of Genetics and Biotechnology, Faculty of Agricultural Sciences, University of Aarhus, Tjele, Denmark
