Reinforcement learning applications to machine scheduling problems: a comprehensive literature review

Journal of Intelligent Manufacturing

Abstract

Reinforcement learning (RL) is one of the most remarkable branches of machine learning and attracts attention from researchers in numerous fields. In recent years especially, RL methods have been applied to machine scheduling problems and rank among the top five most promising methods in the scheduling literature. This study therefore presents a comprehensive literature review of RL applications to machine scheduling problems. The Scopus and Web of Science databases were searched inclusively using appropriate keywords, yielding 80 papers published between 1995 and 2020. These papers were analyzed with respect to applied algorithms, machine environments, job and machine characteristics, objectives, and benchmark methods, and a detailed classification scheme was constructed. Job shop scheduling, unrelated parallel machine scheduling, and single machine scheduling were found to be the most studied problem types. The main contributions of this study are to examine the essential aspects of reinforcement learning in machine scheduling problems, to identify the most frequently investigated problem types, objectives, and constraints, and to reveal deficiencies and promising directions in the related literature. Through its comprehensive analysis, this study can guide researchers who wish to work in this field.
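To make the reviewed setting concrete, the sketch below illustrates the kind of RL-to-scheduling formulation the surveyed papers study: a tabular Q-learning agent learns which dispatching rule (SPT or EDD) to apply at each decision point on a simulated single machine so as to reduce tardiness. This is an illustrative toy, not the method of any reviewed paper; the queue-length state encoding, the random job generator, the SPT/EDD action set, and the negative-tardiness reward are all assumptions chosen for demonstration.

```python
# Minimal illustrative sketch (assumptions, not a reviewed method): tabular
# Q-learning over dispatching rules on a simulated single machine.
import random

RULES = ("SPT", "EDD")              # actions: shortest processing time / earliest due date
Q = {}                              # Q-table: (state, action) -> value
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.2  # learning rate, discount factor, exploration rate

def q(s, a):
    return Q.get((s, a), 0.0)

def new_jobs(rng, n=10):
    # Each job is a (processing_time, due_date) pair; ranges are arbitrary toy values.
    return [(rng.randint(1, 9), rng.randint(5, 60)) for _ in range(n)]

def state_of(jobs):
    # Coarse state signal: remaining queue length, capped to keep the table small.
    return min(len(jobs), 5)

rng = random.Random(0)
for episode in range(5000):
    jobs, t = new_jobs(rng), 0
    while jobs:
        s = state_of(jobs)
        # Epsilon-greedy choice of dispatching rule.
        a = rng.randrange(2) if rng.random() < EPS else max(range(2), key=lambda x: q(s, x))
        key = (lambda j: j[0]) if RULES[a] == "SPT" else (lambda j: j[1])
        job = min(jobs, key=key)     # dispatch the next job under the chosen rule
        jobs.remove(job)
        t += job[0]
        r = -max(0, t - job[1])      # reward: negative tardiness of the dispatched job
        s2 = state_of(jobs)
        target = r + (GAMMA * max(q(s2, x) for x in range(2)) if jobs else 0.0)
        Q[(s, a)] = q(s, a) + ALPHA * (target - q(s, a))

# Inspect the learned greedy policy: preferred rule per queue length.
print({s: RULES[max(range(2), key=lambda a: q(s, a))] for s in range(1, 6)})
```

The reviewed studies differ mainly in how they enrich each piece of this loop: richer state features, larger action sets of composite rules, deep function approximators in place of the table, and multi-agent variants for parallel machine and job shop environments.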

References

  • Ábrahám, G., Auer, P., Dósa, G., Dulai, T., & Werner-Stark, Á. (2019). A reinforcement learning motivated algorithm for process optimization. Periodica Polytechnica Civil Engineering, 63(4), 961–970. https://doi.org/10.3311/PPci.14295

  • Aissani, N., Bekrar, A., Trentesaux, D., & Beldjilali, B. (2012). Dynamic scheduling for multi-site companies: A decisional approach based on reinforcement multi-agent learning. Journal of Intelligent Manufacturing, 23(6), 2513–2529. https://doi.org/10.1007/s10845-011-0580-y

  • Aissani, N., Trentesaux, D., & Beldjilali, B. (2009). Multi-agent reinforcement learning for adaptive scheduling: Application to multi-site company. In IFAC Proceedings Volumes (Vol. 42, No. 4, pp. 1102–1107). https://doi.org/10.3182/20090603-3-RU-2001.0280

  • Aissani, N., & Trentesaux, D. (2008). Efficient and effective reactive scheduling of manufacturing system using Sarsa-multi-objective agents. In Proceedings of the 7th international conference MOSIM, Paris (pp. 698–707).

  • Arviv, K., Stern, H., & Edan, Y. (2016). Collaborative reinforcement learning for a two-robot job transfer flow-shop scheduling problem. International Journal of Production Research, 54(4), 1196–1209. https://doi.org/10.1080/00207543.2015.1057297

  • Atighehchian, A., & Sepehri, M. M. (2013). An environment-driven, function-based approach to dynamic single-machine scheduling. European Journal of Industrial Engineering, 7(1), 100–118. https://doi.org/10.1504/EJIE.2013.051594

  • Aydin, M. E., & Öztemel, E. (2000). Dynamic job-shop scheduling using reinforcement learning agents. Robotics and Autonomous Systems, 33(2), 169–178. https://doi.org/10.1016/S0921-8890(00)00087-7

  • Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(1), 41–77. https://doi.org/10.1023/A:1022140919877

  • Bouazza, W., Sallez, Y., & Beldjilali, B. (2017). A distributed approach solving partially flexible job-shop scheduling problem with a Q-learning effect. IFAC-PapersOnLine, 50(1), 15890–15895. https://doi.org/10.1016/j.ifacol.2017.08.2354

  • Cadavid, J. P. U., Lamouri, S., Grabot, B., Pellerin, R., & Fortin, A. (2020). Machine learning applied in production planning and control: a state-of-the-art in the era of industry 4.0. Journal of Intelligent Manufacturing, 31(6), 1531–1558. https://doi.org/10.1007/s10845-019-01531-7

  • Csáji, B. C., & Monostori, L. (2005). Stochastic approximate scheduling by neurodynamic learning. In IFAC Proceedings Volumes (Vol. 38, No. 1, pp. 355–360). https://doi.org/10.3182/20050703-6-CZ-1902.01481

  • Csáji, B. C., & Monostori, L. (2008). Adaptive stochastic resource control: A machine learning approach. Journal of Artificial Intelligence Research, 32, 453–486. https://doi.org/10.1613/jair.2548

  • Csáji, B. C., Monostori, L., & Kádár, B. (2006). Reinforcement learning in a distributed market-based production control system. Advanced Engineering Informatics, 20(3), 279–288. https://doi.org/10.1016/j.aei.2006.01.001

  • Das, T. K., Gosavi, A., Mahadevan, S., & Marchalleck, N. (1999). Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45(4), 560–574. https://doi.org/10.1287/mnsc.45.4.560

  • De Raedt, L. (2008). Logical and relational learning. New York: Springer. https://doi.org/10.1007/978-3-540-68856-3.

  • Ding, Z., & Dong, H. (2020). Challenges of reinforcement learning. In Deep Reinforcement Learning (pp. 249–272). Singapore: Springer. https://doi.org/10.1007/978-981-15-4095-0_7

  • Dulac-Arnold, G., Mankowitz, D., & Hester, T. (2019). Challenges of real-world reinforcement learning. arXiv preprint. https://arxiv.org/abs/1904.12901

  • Fuchigami, H. Y., & Rangel, S. (2018). A survey of case studies in production scheduling: Analysis and perspectives. Journal of Computational Science, 25, 425–436. https://doi.org/10.1016/j.jocs.2017.06.004

  • Fang, G., Li, Y., Liu, A., & Liu, Z. (2020). A reinforcement learning method to scheduling problem of steel production process. Journal of Physics: Conference Series, 1486(7), 072035. https://doi.org/10.1088/1742-6596/1486/7/072035

  • Gabel, T., & Riedmiller, M. (2006a). Reducing policy degradation in neuro-dynamic programming. In ESANN 2006 Proceedings - European Symposium on Artificial Neural Networks (pp. 653–658).

  • Gabel, T., & Riedmiller, M. (2006b). Multi-agent case-based reasoning for cooperative reinforcement learners. In Roth-Berghofer, T. R., Göker, M. H., & Güvenir, H. A. (Eds.), Advances in case-based reasoning. ECCBR 2006 (Vol. 4106). Berlin, Heidelberg: Springer. https://doi.org/10.1007/11805816_5

  • Gabel, T., & Riedmiller, M. (2007a). On a successful application of multi-agent reinforcement learning to operations research benchmarks. In 2007 IEEE international symposium on approximate dynamic programming and reinforcement learning (pp. 68–75). https://doi.org/10.1109/ADPRL.2007.368171

  • Gabel, T., & Riedmiller, M. (2007b). Scaling adaptive agent-based reactive job-shop scheduling to large-scale problems. In Proceedings of the 2007 IEEE symposium on computational Intelligence in scheduling, CI-Sched 2007 (pp. 259–266). https://doi.org/10.1109/SCIS.2007.367699

  • Gabel, T., & Riedmiller, M. (2008). Adaptive reactive job-shop scheduling with reinforcement learning agents. International Journal of Information Technology and Intelligent Computing, 24(4), 14–18.

  • Gabel, T., & Riedmiller, M. (2011). Distributed policy search reinforcement learning for job-shop scheduling tasks. International Journal of Production Research, 50(1), 41–61. https://doi.org/10.1080/00207543.2011.571443

  • Gosavi, A. (2015). Simulation-based optimization. Berlin: Springer.

  • Graham, R. L., Lawler, E. L., Lenstra, J. K., & Kan, A. H. G. R. (1979). Optimization and approximation in deterministic sequencing and scheduling: A survey. Annals of Discrete Mathematics, 5, 287–326. https://doi.org/10.1016/S0167-5060(08)70356-X

  • Guo, L., Zhuang, Z., Huang, Z., & Qin, W. (2020). Optimization of dynamic multi-objective non-identical parallel machine scheduling with multi-stage reinforcement learning. In 2020 IEEE 16th international conference on automation science and engineering (CASE) (pp. 1215–1219). https://doi.org/10.1109/CASE48305.2020.9216743

  • Han, W., Guo, F., & Su, X. (2019). A reinforcement learning method for a hybrid flow-shop scheduling problem. Algorithms, 12(11), 222. https://doi.org/10.3390/a12110222

  • Heuillet, A., Couthouis, F., & Díaz-Rodríguez, N. (2021). Explainability in deep reinforcement learning. Knowledge-Based Systems, 214, 106685. https://doi.org/10.1016/j.knosys.2020.106685

  • Hong, J., & Prabhu, V. V. (2004). Distributed reinforcement learning control for batch sequencing and sizing in just-in-time manufacturing systems. Applied Intelligence, 20(1), 71–87. https://doi.org/10.1023/B:APIN.0000011143.95085.74

  • Idrees, H. D., Sinnokrot, M. O., & Al-Shihabi, S. (2006). A reinforcement learning algorithm to minimize the mean tardiness of a single machine with controlled capacity. In Proceedings - Winter simulation conference (pp. 1765–1769). https://doi.org/10.1109/WSC.2006.322953

  • Iwamura, K., Mayumi, N., Tanimizu, Y., & Sugimura, N. (2010). A study on real-time scheduling for holonic manufacturing systems - Determination of utility values based on multi-agent reinforcement learning. In International conference on industrial applications of holonic and multi-agent systems (pp. 135–144). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03668-2_13

  • Jiménez, Y. M., Palacio, J. C., & Nowé, A. (2020). Multi-agent reinforcement learning tool for job shop scheduling problems. In International conference on optimization and learning (pp. 3–12). https://doi.org/10.1007/978-3-030-41913-4_1

  • Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285. https://doi.org/10.1613/jair.301

  • Khadilkar, H. (2018). A scalable reinforcement learning algorithm for scheduling railway lines. IEEE Transactions on Intelligent Transportation Systems, 20(2), 727–736. https://doi.org/10.1109/TITS.2018.2829165

  • Kim, G. H., & Lee, C. S. G. (1996). Genetic reinforcement learning for scheduling heterogeneous machines. In Proceedings - IEEE International Conference on Robotics and Automation (Vol. 3, pp. 2798–2803). https://doi.org/10.1109/ROBOT.1996.506586

  • Kim, N., & Shin, H. (2017). The application of actor-critic reinforcement learning for fab dispatching scheduling. In 2017 Winter simulation conference (pp. 4570–4571). https://doi.org/10.1109/WSC.2017.8248209

  • Kong, L. F., & Wu, J. (2005). Dynamic single machine scheduling using Q-learning agent. In 2005 International conference on machine learning and cybernetics, ICMLC 2005 (pp. 3237–3241). https://doi.org/10.1109/ICMLC.2005.1527501

  • Lee, S., Cho, Y., & Lee, Y. H. (2020). Injection mold production sustainable scheduling using deep reinforcement learning. Sustainability, 12(20), 8718. https://doi.org/10.3390/su12208718

  • Lihu, A., & Holban, S. (2009). Top five most promising algorithms in scheduling. In Proceedings – 2009 5th international symposium on applied computational intelligence and informatics, SACI 2009 (pp. 397–404). https://doi.org/10.1109/SACI.2009.5136281

  • Lin, C. C., Deng, D. J., Chih, Y. L., & Chiu, H. T. (2019). Smart manufacturing scheduling with edge computing using multiclass deep Q network. IEEE Transactions on Industrial Informatics, 15(7), 4276–4284. https://doi.org/10.1109/TII.2019.2908210

  • Liu, C. C., Jin, H. Y., Tian, Y., & Yu, H. B. (2001). Reinforcement learning approach to re-entrant manufacturing system scheduling. In 2001 International Conferences on Info-Tech and Info-Net: A Key to Better Life, ICII 2001 - Proceedings (Vol. 3, pp. 280–285). https://doi.org/10.1109/ICII.2001.983070

  • Liu, C. L., Chang, C. C., & Tseng, C. J. (2020). Actor-critic deep reinforcement learning for solving job shop scheduling problems. IEEE Access, 8, 71752–71762. https://doi.org/10.1109/ACCESS.2020.2987820

  • Liu, W., & Wang, X. (2009). Dynamic decision model in evolutionary games based on reinforcement learning. Systems Engineering - Theory & Practice, 29(3), 28–33. https://doi.org/10.1016/S1874-8651(10)60008-7

  • Luo, S. (2020). Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning. Applied Soft Computing, 91, 106208. https://doi.org/10.1016/j.asoc.2020.106208

  • Miyashita, K. (2000). Learning scheduling control knowledge through reinforcements. International Transactions in Operational Research, 7(2), 125–138. https://doi.org/10.1016/S0969-6016(00)00014-9

  • Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., … Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236

  • Monostori, L., & Csáji, B. C. (2006). Stochastic dynamic production control by neurodynamic programming. CIRP Annals - Manufacturing Technology, 55(1), 473–478. https://doi.org/10.1016/S0007-8506(07)60462-4

  • Monostori, L., Csáji, B. C., & Kádár, B. (2004). Adaptation and learning in distributed production control. CIRP Annals - Manufacturing Technology, 53(1), 349–352. https://doi.org/10.1016/S0007-8506(07)60714-8

  • Nahmias, S., & Olsen, T. L. (2015). Production and operations analysis. Long Grove: Waveland Press.

  • Neto, T. R. F., & Godinho Filho, M. (2013). Literature review regarding Ant Colony Optimization applied to scheduling problems: Guidelines for implementation and directions for future research. Engineering Applications of Artificial Intelligence, 26(1), 150–161. https://doi.org/10.1016/j.engappai.2012.03.011

  • Palombarini, J., & Martínez, E. (2010). Learning to repair plans and schedules using a relational (deictic) representation. In Computer aided chemical engineering (Vol. 27, pp. 1377–1382). Elsevier. https://doi.org/10.1016/s1570-7946(09)70620-0

  • Palombarini, J., & Martínez, E. (2012a). SmartGantt – An interactive system for generating and updating rescheduling knowledge using relational abstractions. Computers and Chemical Engineering, 47, 202–216. https://doi.org/10.1016/j.compchemeng.2012.06.021

  • Palombarini, J., & Martínez, E. (2012b). SmartGantt – An intelligent system for real time rescheduling based on relational reinforcement learning. Expert Systems With Applications, 39(11), 10251–10268. https://doi.org/10.1016/j.eswa.2012.02.176

  • Parente, M., Figueira, G., Amorim, P., & Marques, A. (2020). Production scheduling in the context of Industry 4.0: review and trends. International Journal of Production Research, 58(17), 5401–5431. https://doi.org/10.1080/00207543.2020.1718794

  • Park, I., Huh, J., Kim, J., & Park, J. (2020). A reinforcement learning approach to robust scheduling of semiconductor manufacturing facilities. IEEE Transactions on Automation Science and Engineering, 17(3), 1420–1431. https://doi.org/10.1109/tase.2019.2956762

  • Paternina-Arboleda, C. D., & Das, T. K. (2001). Intelligent dynamic control policies for serial production lines. IIE Transactions, 33(1), 65–77. https://doi.org/10.1023/A:1007641824604

  • Qu, S., Chu, T., Wang, J., Leckie, J., & Jian, W. (2015). A centralized reinforcement learning approach for proactive scheduling in manufacturing. In IEEE international conference on emerging technologies and factory automation, ETFA (pp. 1–8). https://doi.org/10.1109/ETFA.2015.7301417

  • Qu, S., Wang, J., Govil, S., & Leckie, J. O. (2016a). Optimized adaptive scheduling of a manufacturing process system with multi-skill workforce and multiple machine types: An ontology-based, multi-agent reinforcement learning approach. Procedia CIRP, 57, 55–60. https://doi.org/10.1016/j.procir.2016.11.011

  • Qu, S., Jie, W., & Shivani, G. (2016b). Learning adaptive dispatching rules for a manufacturing process system by using reinforcement learning approach. In IEEE International Conference on Emerging Technologies and Factory Automation, ETFA (pp. 1–8). https://doi.org/10.1109/etfa.2016.7733712

  • Qu, G., Wierman, A., & Li, N. (2020). Scalable reinforcement learning of localized policies for multi-agent networked systems. In Learning for Dynamics and Control (pp. 256–266).

  • Ramírez-Hernández, J. A., & Fernandez, E. (2005). A case study in scheduling reentrant manufacturing lines: Optimal and simulation-based approaches. In Proceedings of the 44th IEEE conference on decision and control (Vol. 2005, pp. 2158–2163). https://doi.org/10.1109/CDC.2005.1582481

  • Ramírez-Hernández, J. A., & Fernandez, E. (2009). A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model. In Proceedings - Winter simulation conference (pp. 1634–1645). https://doi.org/10.1109/wsc.2009.5429179

  • Ren, J., Ye, C., & Yang, F. (2020). A novel solution to JSPs based on long short-term memory and policy gradient algorithm. International Journal of Simulation Modelling, 19, 157–168. https://doi.org/10.2507/ijsimm19-1-co4

  • Reyna, Y. C. F., Cáceres, A. P., Jiménez, Y. M., & Reyes, Y. T. (2019a). An improvement of reinforcement learning approach for permutation of flow-shop scheduling problems. RISTI - Revista Iberica de Sistemas e Tecnologias de Informacao, (E18), 257–270.

  • Reyna, Y. C. F., Jiménez, Y. M., Cabrera, A. V., & Sánchez, E. A. (2019b). Optimization of heavily constrained hybrid-flexible flowshop problems using a multi-agent reinforcement learning approach. Investigación Operacional, 40(1), 100–111.

  • Reyna, Y. C. F., Jiménez, Y. M., & Nowé, A. (2018). Q-learning algorithm performance for m-machine n-jobs flow shop scheduling to minimize makespan. Investigación Operacional, 38(3), 281–290.

  • Reyna, Y. C. F., Jiménez, Y. M., Bermúdez Cabrera, J. M., & Méndez Hernández, B. M. (2015). A reinforcement learning approach for scheduling problems. Investigación Operacional, 36(3), 225–231.

  • Riedmiller, S., & Riedmiller, M. (1999). A neural reinforcement learning approach to learn local dispatching policies in production scheduling. In IJCAI international joint conference on artificial intelligence (Vol. 2, pp. 764–769).

  • Russell, S., & Norvig, P. (2010). Artificial intelligence: A modern approach. London: Pearson.

  • Schwartz, A. (1993). A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of the tenth international conference on machine learning (pp. 298–305). https://doi.org/10.1016/b978-1-55860-307-3.50045-9

  • Shiue, Y., Lee, K., & Su, C. (2018). Real-time scheduling for a smart factory using a reinforcement learning approach. Computers & Industrial Engineering, 125, 604–614. https://doi.org/10.1016/j.cie.2018.03.039

  • Sigaud, O., & Buffet, O. (2013). Markov decision processes in artificial intelligence: MDPs, beyond MDPs and applications. New York: Wiley.

  • Stricker, N., Kuhnle, A., Sturm, R., & Friess, S. (2018). Reinforcement learning for adaptive order dispatching in the semiconductor industry. CIRP Annals, 67(1), 511–514. https://doi.org/10.1016/j.cirp.2018.04.041

  • Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge: MIT Press.

  • Szepesvári, C. (2010). Algorithms for reinforcement learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 4(1), 1–103. https://doi.org/10.2200/S00268ED1V01Y201005AIM009

  • Thomas, T. E., Koo, J., Chaterji, S., & Bagchi, S. (2018). Minerva: A reinforcement learning-based technique for optimal scheduling and bottleneck detection in distributed factory operations. In 2018 10th international conference on communication systems & networks (COMSNETS) (pp. 129–136). https://doi.org/10.1109/COMSNETS.2018.8328189

  • Van Otterlo, M. (2009). The logic of adaptive behavior: Knowledge representation and algorithms for adaptive sequential decision making under uncertainty in first-order and relational domains. IOS Press.

  • Vapnik, V. N. (2000). Methods of pattern recognition. In The nature of statistical learning theory (pp. 123–180). New York: Springer.

  • Wang, H. X., & Yan, H. S. (2013a). An adaptive scheduling system in knowledgeable manufacturing based on multi-agent. In 10th IEEE international conference on control and automation (ICCA) (pp. 496–501). https://doi.org/10.1109/icca.2013.6564866

  • Wang, H. X., & Yan, H. S. (2013b). An adaptive assembly scheduling approach in knowledgeable manufacturing. Applied Mechanics and Materials, 433–435, 2347–2350. https://doi.org/10.4028/www.scientific.net/AMM.433-435.2347

  • Wang, H. X., & Yan, H. S. (2016). An interoperable adaptive scheduling strategy for knowledgeable manufacturing based on SMGWQ-learning. Journal of Intelligent Manufacturing, 27(5), 1085–1095. https://doi.org/10.1007/s10845-014-0936-1

  • Wang, H. X., Sarker, B. R., Li, J., & Li, J. (2020). Adaptive scheduling for assembly job shop with uncertain assembly times based on dual Q-learning. International Journal of Production Research. https://doi.org/10.1080/00207543.2020.1794075

  • Wang, Y. C., & Usher, J. M. (2004). Learning policies for single machine job dispatching. Robotics and Computer-Integrated Manufacturing, 20(6), 553–562. https://doi.org/10.1016/j.rcim.2004.07.003

  • Wang, Y. C., & Usher, J. M. (2005). Application of reinforcement learning for agent-based production scheduling. Engineering Applications of Artificial Intelligence, 18(1), 73–82. https://doi.org/10.1016/j.engappai.2004.08.018

  • Wang, Y. C., & Usher, J. M. (2007). A reinforcement learning approach for developing routing policies in multi-agent production scheduling. International Journal of Advanced Manufacturing Technology, 33(3–4), 323–333. https://doi.org/10.1007/s00170-006-0465-y

  • Wang, Y. F. (2018). Adaptive job shop scheduling strategy based on weighted Q-learning algorithm. Journal of Intelligent Manufacturing, 31(2), 417–432. https://doi.org/10.1007/s10845-018-1454-3

  • Waschneck, B., Reichstaller, A., Belzner, L., Altenmüller, T., Bauernhansl, T., Knapp, A., & Kyek, A. (2018a). Optimization of global production scheduling with deep reinforcement learning. Procedia CIRP, 72, 1264–1269. https://doi.org/10.1016/j.procir.2018.03.212

  • Waschneck, B., Reichstaller, A., Belzner, L., Altenmüller, T., Bauernhansl, T., Knapp, A., & Kyek, A. (2018b). Deep reinforcement learning for semiconductor production scheduling. In 2018 29th annual SEMI advanced semiconductor manufacturing conference, ASMC 2018 (pp. 301–306). https://doi.org/10.1109/asmc.2018.8373191

  • Wei, Y., & Zhao, M. (2004). Composite rules selection using reinforcement learning for dynamic job-shop scheduling. In 2004 IEEE conference on robotics, automation and mechatronics (Vol. 2, pp. 1083–1088). https://doi.org/10.1109/RAMECH.2004.1438070

  • Xanthopoulos, A. S., Koulouriotis, D. E., Tourassis, V. D., & Emiris, D. M. (2013). Intelligent controllers for bi-objective dynamic scheduling on a single machine with sequence-dependent setups. Applied Soft Computing Journal, 13(12), 4704–4717. https://doi.org/10.1016/j.asoc.2013.07.015

  • Xiao, Y., Tan, Q., Zhou, L., & Tang, H. (2017). Stochastic scheduling with compatible job families by an improved Q-learning algorithm. In Chinese Control Conference, CCC (pp. 2657–2662). https://doi.org/10.23919/ChiCC.2017.8027764

  • Yang, H. B., & Yan, H. S. (2009). An adaptive approach to dynamic scheduling in knowledgeable manufacturing cell. International Journal of Advanced Manufacturing Technology, 42(3–4), 312–320. https://doi.org/10.1007/s00170-008-1588-0

  • Yang, H. B., & Yan, H. S. (2007). An adaptive policy of dynamic scheduling in knowledgeable manufacturing environment. In Proceedings of the IEEE international conference on automation and logistics, ICAL 2007 (pp. 835–840). https://doi.org/10.1109/ICAL.2007.4338680

  • Yingzi, W., Xinli, J., & Pingbo, H. (2009). Pattern driven dynamic scheduling approach using reinforcement learning. In 2009 IEEE international conference on automation and logistics (pp. 514–519). https://doi.org/10.1109/ICAL.2009.5262867

  • Yuan, B., Jiang, Z., & Wang, L. (2016). Dynamic parallel machine scheduling with random breakdowns using the learning agent. International Journal of Services Operations and Informatics, 8(2), 94–103. https://doi.org/10.1504/IJSOI.2016.080083

  • Yuan, B., Wang, L., & Jiang, Z. (2013). Dynamic parallel machine scheduling using the learning agent. In 2013 IEEE international conference on industrial engineering and engineering management (pp. 1565–1569). https://doi.org/10.1109/IEEM.2013.6962673

  • Zhang, T., Xie, S., & Rose, O. (2017). Real-time job shop scheduling based on simulation and Markov decision processes. In Proceedings - Winter simulation conference (pp. 3899–3907). https://doi.org/10.1109/WSC.2017.8248100

  • Zhang, T., Xie, S., & Rose, O. (2018). Real-time batching in job shops based on simulation and reinforcement learning. In 2018 Winter simulation conference (WSC) (pp. 3331–3339). https://doi.org/10.1109/WSC.2018.8632524

  • Zhang, W., & Dietterich, T. G. (1995). A reinforcement learning approach to job-shop scheduling. In 1995 International joint conference on artificial intelligence (pp. 1114–1120).

  • Zhang, W., & Dietterich, T. G. (1996). High-performance job-shop scheduling with a time-delay TD(λ) network. Advances in Neural Information Processing Systems, 8, 1024–1030.

  • Zhang, Z., Zheng, L., Hou, F., & Li, N. (2011). Semiconductor final test scheduling with Sarsa(λ, k) algorithm. European Journal of Operational Research, 215(2), 446–458. https://doi.org/10.1016/j.ejor.2011.05.052

  • Zhang, Z., Zheng, L., Li, N., Wang, W., Zhong, S., & Hu, K. (2012). Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning. Computers and Operations Research, 39(7), 1315–1324. https://doi.org/10.1016/j.cor.2011.07.019

  • Zhang, Z., Zheng, L., & Weng, M. X. (2007). Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-learning. International Journal of Advanced Manufacturing Technology, 34(9–10), 968–980. https://doi.org/10.1007/s00170-006-0662-8

  • Zhao, M., Li, X., Gao, L., Wang, L., & Xiao, M. (2019). An improved Q-learning based rescheduling method for flexible job-shops with machine failures. In 2019 IEEE 15th international conference on automation science and engineering (CASE) (pp. 331–337). https://doi.org/10.1109/COASE.2019.8843100

  • Zhou, L., Zhang, L., & Horn, B. K. P. (2020). Deep reinforcement learning-based dynamic scheduling in smart manufacturing. Procedia CIRP, 93, 383–388. https://doi.org/10.1016/j.procir.2020.05.163

Author information

Correspondence to Behice Meltem Kayhan.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kayhan, B.M., Yildiz, G. Reinforcement learning applications to machine scheduling problems: a comprehensive literature review. J Intell Manuf 34, 905–929 (2023). https://doi.org/10.1007/s10845-021-01847-3
