Abstract
Fully probabilistic design of decision strategies (FPD) extends Bayesian dynamic decision making. FPD specifies the decision aim via a so-called ideal: a probability density that assigns high probability to desirable behaviours and low probability to undesirable ones. The optimal decision strategy minimises the Kullback-Leibler divergence of the probability density describing the closed-loop behaviour from this ideal. Although explicit minimisers are available in the corresponding dynamic programming, the design suffers from the curse of dimensionality connected with the complexity of the value function. The recently proposed lazy FPD tailors lazy learning, which builds a local model around the current behaviour, to the estimation of the closed-loop model with the optimal strategy. This paper adds theoretical support to the lazy FPD and outlines its further improvement.
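To make the design criterion concrete, the following is a minimal sketch of a single-step FPD with finitely many actions and outcomes. It is an illustration under stated assumptions, not the lazy FPD algorithm of the paper. All names (`one_step_fpd`, `closed_loop_kl`, the toy numbers) are hypothetical. In this restricted setting the closed-loop density factorises as model times strategy, and the Kullback-Leibler minimiser is the ideal strategy reweighted by exp(-omega), where omega is the divergence of the model from its ideal for each action.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def one_step_fpd(model, ideal_model, ideal_strategy):
    """Single-step FPD over finite actions and outcomes (illustrative only).

    model[a][x]       : probability of outcome x under action a
    ideal_model[a][x] : ideal probability of outcome x under action a
    ideal_strategy[a] : ideal probability of action a
    Returns the KL-optimal randomised strategy and the attained divergence.
    """
    # omega[a] = D(model(.|a) || ideal_model(.|a))
    omega = [kl(m, mi) for m, mi in zip(model, ideal_model)]
    # Explicit minimiser: ideal strategy reweighted by exp(-omega), normalised.
    weights = [si * math.exp(-w) for si, w in zip(ideal_strategy, omega)]
    z = sum(weights)
    strategy = [w / z for w in weights]
    return strategy, -math.log(z)


def closed_loop_kl(strategy, model, ideal_model, ideal_strategy):
    """KL divergence of the joint closed-loop density from the joint ideal."""
    total = 0.0
    for s, si, m, mi in zip(strategy, ideal_strategy, model, ideal_model):
        if s > 0:
            total += s * (math.log(s / si) + kl(m, mi))
    return total


# Toy numbers (assumed): two actions, two outcomes.
model = [[0.9, 0.1], [0.3, 0.7]]
ideal_model = [[0.8, 0.2], [0.8, 0.2]]
ideal_strategy = [0.5, 0.5]

strategy, value = one_step_fpd(model, ideal_model, ideal_strategy)
```

The returned strategy is randomised and favours the action whose outcome model lies closer to the ideal. The dynamic case repeats this step backwards in time with a value function entering omega, and that value function is where the curse of dimensionality mentioned above arises.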
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kárný, M., Macek, K., Guy, T.V. (2014). Lazy Fully Probabilistic Design of Decision Strategies. In: Zeng, Z., Li, Y., King, I. (eds) Advances in Neural Networks – ISNN 2014. ISNN 2014. Lecture Notes in Computer Science, vol 8866. Springer, Cham. https://doi.org/10.1007/978-3-319-12436-0_16
DOI: https://doi.org/10.1007/978-3-319-12436-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12435-3
Online ISBN: 978-3-319-12436-0
eBook Packages: Computer Science, Computer Science (R0)