Intelligent dynamic control policies for serial production lines

Paternina-Arboleda, Carlos D.; Das, Tapas K.

doi:10.1023/A:1007641824604


Published: January 2001

Volume 33, pages 65–77 (2001)

IIE Transactions

Carlos D. Paternina-Arboleda¹ &
Tapas K. Das²


Abstract

Heuristic production control policies such as CONWIP, kanban, and other hybrid policies have been in use for years as better alternatives to MRP-based push control policies. These policies, although efficient, are far from optimal. Our goal is to develop a methodology that, for a given system, finds a dynamic control policy via intelligent agents. Such a policy, while achieving the productivity (i.e., demand service rate) goal of the system, will optimize a cost/reward function based on the WIP inventory. To achieve this goal, we applied a simulation-based optimization technique called Reinforcement Learning (RL) to a four-station serial line. The control policy obtained by applying an RL algorithm was compared with existing policies on the basis of total average WIP and average cost of WIP. We also developed a heuristic control policy based on a close examination of the policies obtained by the RL algorithm. This heuristic policy, named Behavior-Based Control (BBC), although second to the RL policy, proved to be a more efficient and leaner control policy than most of the existing policies in the literature. The performance of the BBC policy was found to be comparable to that of the Extended Kanban Control System (EKCS), which, in our experiments, turned out to be the best of the existing policies. The numerical results used for comparison were obtained from a four-station serial line with two different demand arrival processes (constant and Poisson).
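As a rough illustration of the idea of learning a WIP release policy from simulation, the sketch below trains a tabular agent on a toy single-buffer system. It is not the method of the article: it uses a single buffer rather than a four-station line and plain discounted Q-learning as a stand-in for whatever RL algorithm the authors applied. The cost values, demand probability, buffer capacity, and all function names are hypothetical and chosen only to make the example small.

```python
import random

# All constants below are hypothetical, chosen only for illustration.
MAX_WIP = 5           # buffer capacity of the toy system
HOLD_COST = 1.0       # cost per unit of WIP held at the end of a period
SHORTAGE_COST = 10.0  # penalty for a demand that finds the buffer empty
DEMAND_PROB = 0.5     # probability that one unit of demand arrives per period


def step(wip, action, rng):
    """Advance the toy buffer by one period and return (next_wip, reward)."""
    # Action 1 authorizes one unit of production if the buffer has room;
    # action 0 holds production.
    if action == 1 and wip < MAX_WIP:
        wip += 1
    reward = 0.0
    # At most one unit of demand arrives per period.
    if rng.random() < DEMAND_PROB:
        if wip > 0:
            wip -= 1
        else:
            reward -= SHORTAGE_COST
    # Holding cost is charged on the WIP left at the end of the period.
    reward -= HOLD_COST * wip
    return wip, reward


def train(periods=50_000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Discounted tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[wip][action] estimates the discounted return of taking the action.
    q = [[0.0, 0.0] for _ in range(MAX_WIP + 1)]
    wip = 0
    for _ in range(periods):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = 0 if q[wip][0] >= q[wip][1] else 1  # exploit
        nxt, reward = step(wip, action, rng)
        # Standard Q-learning update toward the one-step bootstrapped target.
        target = reward + gamma * max(q[nxt])
        q[wip][action] += alpha * (target - q[wip][action])
        wip = nxt
    return q


def greedy_policy(q):
    """Return the greedy action (0 = hold, 1 = release) for each WIP level."""
    return [0 if row[0] >= row[1] else 1 for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The learned table maps each WIP level to a hold or release decision, which is the kind of state-dependent (dynamic) rule the abstract contrasts with fixed-parameter policies such as kanban or CONWIP. With the assumed costs, the shortage penalty dominates the holding cost, so the agent should learn to release work when the buffer is empty.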



Author information

Authors and Affiliations

Departamento de Ingeniería Industrial Universidad del Norte, Barranquilla, Colombia
Carlos D. Paternina-Arboleda
Industrial and Management Systems Engineering Department, University of South Florida, Tampa, FL, 33620, USA
Tapas K. Das


About this article

Cite this article

Paternina-Arboleda, C.D., Das, T.K. Intelligent dynamic control policies for serial production lines. IIE Transactions 33, 65–77 (2001). https://doi.org/10.1023/A:1007641824604


Issue Date: January 2001
DOI: https://doi.org/10.1023/A:1007641824604
