Simulation-based learning of cost-to-go for control of nonlinear processes

Abstract

In this paper, we present a simulation-based dynamic programming method that learns the ‘cost-to-go’ function iteratively. The method addresses two important drawbacks of the conventional Model Predictive Control (MPC) formulation: the potentially exorbitant online computational requirement and the inability to account for the future interplay between uncertainty and estimation in the optimal control calculation. We use a nonlinear Van de Vusse reactor to investigate the efficacy of the proposed approach and to identify further research issues.
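The full algorithm is not reproduced on this page, but the idea the abstract describes — move the expensive optimization offline by learning a cost-to-go function from simulations, then run only a cheap one-step lookahead online — can be sketched. Below is a minimal Python illustration using a fitted value-iteration variant on a Van de Vusse reactor model; the rate constants, setpoint, quadratic stage cost, polynomial feature basis, control grid, and discount factor are all illustrative assumptions, not the authors' settings.

```python
# Minimal sketch of simulation-based cost-to-go learning (a fitted
# value-iteration variant), NOT the authors' exact algorithm.
# All numerical settings below are illustrative assumptions.
import numpy as np

# Van de Vusse reactor (A -> B -> C, 2A -> D); state x = [Ca, Cb],
# manipulated input u = dilution rate F/V [1/h].
K1, K2, K3, CAF = 50.0, 100.0, 10.0, 10.0  # assumed rate constants, feed conc.
DT, N_SUB = 0.002, 10                      # Euler step [h], substeps per sample
CB_SET = 1.12                              # assumed setpoint for Cb [mol/L]
U_GRID = np.linspace(10.0, 120.0, 25)      # candidate dilution rates [1/h]
GAMMA = 0.98                               # discount factor

def step(x, u):
    """Integrate the reactor ODE over one sampling period (explicit Euler)."""
    ca, cb = x
    for _ in range(N_SUB):
        dca = -K1 * ca - K3 * ca**2 + u * (CAF - ca)
        dcb = K1 * ca - K2 * cb - u * cb
        ca, cb = ca + DT * dca, cb + DT * dcb
    return np.array([ca, cb])

def stage_cost(x, u):
    """Quadratic tracking cost on Cb with a small input penalty (assumed)."""
    return (x[1] - CB_SET) ** 2 + 1e-4 * u**2

def features(X):
    """Quadratic polynomial basis for the cost-to-go approximation."""
    ca, cb = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(ca), ca, cb, ca**2, cb**2, ca * cb])

def lookahead(x, theta):
    """One-step-lookahead values over the control grid."""
    return [stage_cost(x, u) + GAMMA * (features(step(x, u)[None]) @ theta)[0]
            for u in U_GRID]

# Sample states once; iterate Bellman backups against the current fit.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(1.0, 6.0, 400),   # Ca samples [mol/L]
                     rng.uniform(0.5, 1.5, 400)])  # Cb samples [mol/L]
theta = np.zeros(6)                                # J(x) ~ features(x) @ theta

for _ in range(30):
    targets = np.array([min(lookahead(x, theta)) for x in X])
    theta, *_ = np.linalg.lstsq(features(X), targets, rcond=None)

# Online control is now a cheap one-step minimization over the grid,
# instead of a full MPC horizon optimization.
def control(x):
    return U_GRID[int(np.argmin(lookahead(x, theta)))]

x = np.array([3.0, 1.0])
for _ in range(50):
    x = step(x, control(x))
print("state after 50 samples:", x, "(target Cb =", CB_SET, ")")
```

The sketch captures the offline/online split the abstract emphasizes: the iterative improvement of the learned cost-to-go happens entirely in simulation, and the online controller reduces to a one-step lookahead against the fitted function.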


References

  • Åström, K. J. and Helmersson, A., “Dual Control of an Integrator with Unknown Gain,” Comp. and Maths. with Appls., 12A, 653 (1986).

  • Åström, K. J. and Wittenmark, B., “Adaptive Control,” Addison-Wesley (1989).

  • Bellman, R. E., “Dynamic Programming,” Princeton University Press, New Jersey (1957).

  • Bemporad, A. and Morari, M., “Control of Systems Integrating Logic, Dynamics and Constraints,” Automatica, 35, 407 (1999).

  • Bertsekas, D. P., “Dynamic Programming and Optimal Control,” 2nd ed., Athena Scientific, Belmont, MA (2000).

  • Bertsekas, D. P. and Tsitsiklis, J. N., “Neuro-Dynamic Programming,” Athena Scientific, Belmont, MA (1996).

  • Chikkula, Y. and Lee, J. H., “Robust Adaptive Predictive Control of Nonlinear Processes Using Nonlinear Moving Average System Models,” Ind. Eng. Chem. Res., 39, 2010 (2000).

  • Crites, R. H. and Barto, A. G., “Improving Elevator Performance Using Reinforcement Learning,” Advances in Neural Information Processing Systems 8, Touretzky, D. S., Mozer, M. C. and Hasselmo, M. E., eds., MIT Press, Cambridge, MA, 1017 (1996).

  • Henson, M. A., “Nonlinear Model Predictive Control: Current Status and Future Directions,” Computers and Chemical Engineering, 23, 187 (1998).

  • Howard, R. A., “Dynamic Programming and Markov Processes,” MIT Press, Cambridge, MA (1960).

  • Kaisare, N. S., Lee, J. M. and Lee, J. H., “Simulation Based Strategy for Nonlinear Optimal Control: Application to a Microbial Cell Reactor,” International Journal of Robust and Nonlinear Control, 13, 347 (2003).

  • Lee, J. H. and Cooley, B., “Recent Advances in Model Predictive Control,” Chemical Process Control-V, 201 (1997).

  • Lee, J. H. and Ricker, N. L., “Extended Kalman Filter Based Nonlinear Model Predictive Control,” Ind. Eng. Chem. Res., 33, 1530 (1994).

  • Lee, J. H. and Yu, Z., “Worst-Case Formulations of Model Predictive Control for Systems with Bounded Parameters,” Automatica, 33, 763 (1997).

  • Lee, J. M. and Lee, J. H., “Simulation-Based Dual Mode Controller for Nonlinear Processes,” Proceedings of IFAC ADCHEM 2003, accepted (2004).

  • Lee, J. M. and Lee, J. H., “Neuro-Dynamic Programming Approach to Dual Control Problems,” AIChE Annual Meeting, Reno, NV (2001).

  • Marbach, P. and Tsitsiklis, J. N., “Simulation-Based Optimization of Markov Reward Processes,” IEEE Transactions on Automatic Control, 46, 191 (2001).

  • Mayne, D. Q., Rawlings, J. B., Rao, C. V. and Scokaert, P. O. M., “Constrained Model Predictive Control: Stability and Optimality,” Automatica, 36, 789 (2000).

  • Meadows, E. S. and Rawlings, J. B., “Model Predictive Control,” Nonlinear Process Control, Henson, M. A. and Seborg, D. E., eds., Prentice Hall, New Jersey, 233 (1997).

  • Morari, M. and Lee, J. H., “Model Predictive Control: Past, Present and Future,” Computers and Chemical Engineering, 23, 667 (1999).

  • Puterman, M. L., “Markov Decision Processes,” Wiley, New York, NY (1994).

  • Sistu, P. B. and Bequette, B. W., “Model Predictive Control of Processes with Input Multiplicities,” Chemical Engineering Science, 50, 921 (1995).

  • Sutton, R. S. and Barto, A. G., “Reinforcement Learning: An Introduction,” MIT Press, Cambridge, MA (1998).

  • Tesauro, G. J., “Practical Issues in Temporal Difference Learning,” Machine Learning, 8, 257 (1992).

  • Van de Vusse, J. G., “Plug-Flow Type Reactor versus Tank Reactor,” Chemical Engineering Science, 19, 964 (1964).

  • Zhang, W. and Dietterich, T. G., “A Reinforcement Learning Approach to Job Shop Scheduling,” Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1114 (1995).

Author information

Corresponding author

Correspondence to Jay H. Lee.

Additional information

This paper is dedicated to Professor Hyun-Ku Rhee on the occasion of his retirement from Seoul National University.

About this article

Cite this article

Lee, J.M., Lee, J.H. Simulation-based learning of cost-to-go for control of nonlinear processes. Korean J. Chem. Eng. 21, 338–344 (2004). https://doi.org/10.1007/BF02705417

