Reinforcement learning applications to machine scheduling problems: a comprehensive literature review

Journal of Intelligent Manufacturing

Abstract

Reinforcement learning (RL) is one of the most remarkable branches of machine learning and attracts attention from researchers in numerous fields. In recent years especially, RL methods have been applied to machine scheduling problems and rank among the top five most promising methods in the scheduling literature. This study therefore presents a comprehensive literature review of RL applications to machine scheduling problems. The Scopus and Web of Science databases were searched inclusively using appropriate keywords, yielding 80 papers published between 1995 and 2020. These papers were analyzed with respect to applied algorithms, machine environments, job and machine characteristics, objectives, and benchmark methods, and a detailed classification scheme was constructed. Job shop scheduling, unrelated parallel machine scheduling, and single machine scheduling were found to be the most studied problem types. The main contributions of this study are to examine the essential aspects of reinforcement learning in machine scheduling problems, to identify the most frequently investigated problem types, objectives, and constraints, and to reveal deficiencies and promising directions in the related literature. Through its comprehensive analysis, this study can guide researchers who wish to work in this field.
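To make the reviewed setting concrete, the sketch below illustrates the kind of RL-to-scheduling formulation the surveyed papers study: a tabular Q-learning agent learns which dispatching rule (SPT or EDD) to apply at each decision point on a simulated single machine so as to reduce tardiness. This is an illustrative toy, not the method of any reviewed paper; the queue-length state encoding, the random job generator, the SPT/EDD action set, and the negative-tardiness reward are all assumptions chosen for demonstration.

```python
# Minimal illustrative sketch (assumptions, not a reviewed method): tabular
# Q-learning over dispatching rules on a simulated single machine.
import random

RULES = ("SPT", "EDD")              # actions: shortest processing time / earliest due date
Q = {}                              # Q-table: (state, action) -> value
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.2  # learning rate, discount factor, exploration rate

def q(s, a):
    return Q.get((s, a), 0.0)

def new_jobs(rng, n=10):
    # Each job is a (processing_time, due_date) pair; ranges are arbitrary toy values.
    return [(rng.randint(1, 9), rng.randint(5, 60)) for _ in range(n)]

def state_of(jobs):
    # Coarse state signal: remaining queue length, capped to keep the table small.
    return min(len(jobs), 5)

rng = random.Random(0)
for episode in range(5000):
    jobs, t = new_jobs(rng), 0
    while jobs:
        s = state_of(jobs)
        # Epsilon-greedy choice of dispatching rule.
        a = rng.randrange(2) if rng.random() < EPS else max(range(2), key=lambda x: q(s, x))
        key = (lambda j: j[0]) if RULES[a] == "SPT" else (lambda j: j[1])
        job = min(jobs, key=key)     # dispatch the next job under the chosen rule
        jobs.remove(job)
        t += job[0]
        r = -max(0, t - job[1])      # reward: negative tardiness of the dispatched job
        s2 = state_of(jobs)
        target = r + (GAMMA * max(q(s2, x) for x in range(2)) if jobs else 0.0)
        Q[(s, a)] = q(s, a) + ALPHA * (target - q(s, a))

# Inspect the learned greedy policy: preferred rule per queue length.
print({s: RULES[max(range(2), key=lambda a: q(s, a))] for s in range(1, 6)})
```

The reviewed studies differ mainly in how they enrich each piece of this loop: richer state features, larger action sets of composite rules, deep function approximators in place of the table, and multi-agent variants for parallel machine and job shop environments.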

References

  • Ábrahám, G., Auer, P., Dósa, G., Dulai, T., & Werner-Stark, Á. (2019). A reinforcement learning motivated algorithm for process optimization. Periodica Polytechnica Civil Engineering, 63(4), 961–970. https://doi.org/10.3311/PPci.14295

  • Aissani, N., Bekrar, A., Trentesaux, D., & Beldjilali, B. (2012). Dynamic scheduling for multi-site companies: A decisional approach based on reinforcement multi-agent learning. Journal of Intelligent Manufacturing, 23(6), 2513–2529. https://doi.org/10.1007/s10845-011-0580-y

  • Aissani, N., Trentesaux, D., & Beldjilali, B. (2009). Multi-agent reinforcement learning for adaptive scheduling: Application to multi-site company. In IFAC Proceedings Volumes (Vol. 42, No. 4, pp. 1102–1107). https://doi.org/10.3182/20090603-3-RU-2001.0280

  • Aissani, N., & Trentesaux, D. (2008). Efficient and effective reactive scheduling of manufacturing system using Sarsa-multi-objective agents. In Proceedings of the 7th international conference MOSIM, Paris (pp. 698–707).

  • Arviv, K., Stern, H., & Edan, Y. (2016). Collaborative reinforcement learning for a two-robot job transfer flow-shop scheduling problem. International Journal of Production Research, 54(4), 1196–1209. https://doi.org/10.1080/00207543.2015.1057297

  • Atighehchian, A., & Sepehri, M. M. (2013). An environment-driven, function-based approach to dynamic single-machine scheduling. European Journal of Industrial Engineering, 7(1), 100–118. https://doi.org/10.1504/EJIE.2013.051594

  • Aydin, M. E., & Öztemel, E. (2000). Dynamic job-shop scheduling using reinforcement learning agents. Robotics and Autonomous Systems, 33(2), 169–178. https://doi.org/10.1016/S0921-8890(00)00087-7

  • Barto, A. G., & Mahadevan, S. (2003). Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems, 13(1), 41–77. https://doi.org/10.1023/A:1022140919877

  • Bouazza, W., Sallez, Y., & Beldjilali, B. (2017). A distributed approach solving partially flexible job-shop scheduling problem with a Q-learning effect. IFAC-PapersOnLine, 50(1), 15890–15895. https://doi.org/10.1016/j.ifacol.2017.08.2354

  • Cadavid, J. P. U., Lamouri, S., Grabot, B., Pellerin, R., & Fortin, A. (2020). Machine learning applied in production planning and control: a state-of-the-art in the era of industry 4.0. Journal of Intelligent Manufacturing, 31(6), 1531–1558. https://doi.org/10.1007/s10845-019-01531-7

  • Csáji, B. C., & Monostori, L. (2005). Stochastic approximate scheduling by neurodynamic learning. In IFAC Proceedings Volumes (Vol. 38, No. 1, pp. 355–360). https://doi.org/10.3182/20050703-6-CZ-1902.01481

  • Csáji, B. C., & Monostori, L. (2008). Adaptive stochastic resource control: A machine learning approach. Journal of Artificial Intelligence Research, 32, 453–486. https://doi.org/10.1613/jair.2548

  • Csáji, B. C., Monostori, L., & Kádár, B. (2006). Reinforcement learning in a distributed market-based production control system. Advanced Engineering Informatics, 20(3), 279–288. https://doi.org/10.1016/j.aei.2006.01.001

  • Das, T. K., Gosavi, A., Mahadevan, S., & Marchalleck, N. (1999). Solving semi-Markov decision problems using average reward reinforcement learning. Management Science, 45(4), 560–574. https://doi.org/10.1287/mnsc.45.4.560

  • De Raedt, L. (2008). Logical and relational learning. New York: Springer. https://doi.org/10.1007/978-3-540-68856-3.

  • Ding, Z., & Dong, H. (2020). Challenges of reinforcement learning. In Deep Reinforcement Learning (pp. 249–272). Singapore: Springer. https://doi.org/10.1007/978-981-15-4095-0_7

  • Dulac-Arnold, G., Mankowitz, D., & Hester, T. (2019). Challenges of real-world reinforcement learning. arXiv preprint. https://arxiv.org/abs/1904.12901

  • Fuchigami, H. Y., & Rangel, S. (2018). A survey of case studies in production scheduling: Analysis and perspectives. Journal of Computational Science, 25, 425–436. https://doi.org/10.1016/j.jocs.2017.06.004

  • Fang, G., Li, Y., Liu, A., & Liu, Z. (2020). A reinforcement learning method to scheduling problem of steel production process. Journal of Physics: Conference Series, 1486(7), 072035. https://doi.org/10.1088/1742-6596/1486/7/072035

  • Gabel, T., & Riedmiller, M. (2006a). Reducing policy degradation in neuro-dynamic programming. In ESANN 2006 Proceedings - European Symposium on Artificial Neural Networks (pp. 653–658).

  • Gabel, T., & Riedmiller, M. (2006b). Multi-agent case-based reasoning for cooperative reinforcement learners. In Roth-Berghofer, T. R., Göker, M. H., & Güvenir, H. A. (Eds.), Advances in case-based reasoning. ECCBR 2006 (Vol. 4106). Berlin, Heidelberg: Springer. https://doi.org/10.1007/11805816_5

  • Gabel, T., & Riedmiller, M. (2007a). On a successful application of multi-agent reinforcement learning to operations research benchmarks. In 2007 IEEE international symposium on approximate dynamic programming and reinforcement learning (pp. 68–75). https://doi.org/10.1109/ADPRL.2007.368171

  • Gabel, T., & Riedmiller, M. (2007b). Scaling adaptive agent-based reactive job-shop scheduling to large-scale problems. In Proceedings of the 2007 IEEE symposium on computational Intelligence in scheduling, CI-Sched 2007 (pp. 259–266). https://doi.org/10.1109/SCIS.2007.367699

  • Gabel, T., & Riedmiller, M. (2008). Adaptive reactive job-shop scheduling with reinforcement learning agents. International Journal of Information Technology and Intelligent Computing, 24(4), 14–18.

  • Gabel, T., & Riedmiller, M. (2011). Distributed policy search reinforcement learning for job-shop scheduling tasks. International Journal of Production Research, 50(1), 41–61. https://doi.org/10.1080/00207543.2011.571443

  • Gosavi, A. (2015). Simulation-based optimization. Berlin: Springer.

  • Graham, R. L., Lawler, E. L., Lenstra, J. K., & Kan, A. H. G. R. (1979). Optimization and approximation in deterministic sequencing and scheduling: A survey. Annals of Discrete Mathematics, 5, 287–326. https://doi.org/10.1016/S0167-5060(08)70356-X

  • Guo, L., Zhuang, Z., Huang, Z., & Qin, W. (2020). Optimization of dynamic multi-objective non-identical parallel machine scheduling with multi-stage reinforcement learning. In 2020 IEEE 16th international conference on automation science and engineering (CASE) (pp. 1215–1219). https://doi.org/10.1109/CASE48305.2020.9216743

  • Han, W., Guo, F., & Su, X. (2019). A reinforcement learning method for a hybrid flow-shop scheduling problem. Algorithms, 12(11), 222. https://doi.org/10.3390/a12110222

  • Heuillet, A., Couthouis, F., & Díaz-Rodríguez, N. (2021). Explainability in deep reinforcement learning. Knowledge-Based Systems, 214, 106685. https://doi.org/10.1016/j.knosys.2020.106685

  • Hong, J., & Prabhu, V. V. (2004). Distributed reinforcement learning control for batch sequencing and sizing in just-in-time manufacturing systems. Applied Intelligence, 20(1), 71–87. https://doi.org/10.1023/B:APIN.0000011143.95085.74

  • Idrees, H. D., Sinnokrot, M. O., & Al-Shihabi, S. (2006). A reinforcement learning algorithm to minimize the mean tardiness of a single machine with controlled capacity. In Proceedings - Winter simulation conference (pp. 1765–1769). https://doi.org/10.1109/WSC.2006.322953

  • Iwamura, K., Mayumi, N., Tanimizu, Y., & Sugimura, N. (2010). A study on real-time scheduling for holonic manufacturing systems - Determination of utility values based on multi-agent reinforcement learning. In International conference on industrial applications of holonic and multi-agent systems (pp. 135–144). Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03668-2_13

  • Jiménez, Y. M., Palacio, J. C., & Nowé, A. (2020). Multi-agent reinforcement learning tool for job shop scheduling problems. In International conference on optimization and learning (pp. 3–12). https://doi.org/10.1007/978-3-030-41913-4_1

  • Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285. https://doi.org/10.1613/jair.301

  • Khadilkar, H. (2018). A scalable reinforcement learning algorithm for scheduling railway lines. IEEE Transactions on Intelligent Transportation Systems, 20(2), 727–736. https://doi.org/10.1109/TITS.2018.2829165

  • Kim, G. H., & Lee, C. S. G. (1996). Genetic reinforcement learning for scheduling heterogeneous machines. In Proceedings - IEEE International Conference on Robotics and Automation (Vol. 3, pp. 2798–2803). https://doi.org/10.1109/ROBOT.1996.506586

  • Kim, N., & Shin, H. (2017). The application of actor-critic reinforcement learning for fab dispatching scheduling. In 2017 Winter simulation conference (pp. 4570–4571). https://doi.org/10.1109/WSC.2017.8248209

  • Kong, L. F., & Wu, J. (2005). Dynamic single machine scheduling using Q-learning agent. In 2005 International conference on machine learning and cybernetics, ICMLC 2005 (pp. 3237–3241). https://doi.org/10.1109/ICMLC.2005.1527501

  • Lee, S., Cho, Y., & Lee, Y. H. (2020). Injection mold production sustainable scheduling using deep reinforcement learning. Sustainability, 12(20), 8718. https://doi.org/10.3390/su12208718

  • Lihu, A., & Holban, S. (2009). Top five most promising algorithms in scheduling. In Proceedings – 2009 5th international symposium on applied computational intelligence and informatics, SACI 2009 (pp. 397–404). https://doi.org/10.1109/SACI.2009.5136281

  • Lin, C. C., Deng, D. J., Chih, Y. L., & Chiu, H. T. (2019). Smart manufacturing scheduling with edge computing using multiclass deep Q network. IEEE Transactions on Industrial Informatics, 15(7), 4276–4284. https://doi.org/10.1109/TII.2019.2908210

  • Liu, C. C., Jin, H. Y., Tian, Y., & Yu, H. B. (2001). Reinforcement learning approach to re-entrant manufacturing system scheduling. In 2001 International Conferences on Info-Tech and Info-Net: A Key to Better Life, ICII 2001 - Proceedings (Vol. 3, pp. 280–285). https://doi.org/10.1109/ICII.2001.983070

  • Liu, C. L., Chang, C. C., & Tseng, C. J. (2020). Actor-critic deep reinforcement learning for solving job shop scheduling problems. IEEE Access, 8, 71752–71762. https://doi.org/10.1109/ACCESS.2020.2987820

  • Liu, W., & Wang, X. (2009). Dynamic decision model in evolutionary games based on reinforcement learning. Systems Engineering - Theory & Practice, 29(3), 28–33. https://doi.org/10.1016/S1874-8651(10)60008-7

  • Luo, S. (2020). Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning. Applied Soft Computing, 91, 106208. https://doi.org/10.1016/j.asoc.2020.106208

  • Miyashita, K. (2000). Learning scheduling control knowledge through reinforcements. International Transactions in Operational Research, 7(2), 125–138. https://doi.org/10.1016/S0969-6016(00)00014-9

  • Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., … Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236

  • Monostori, L., & Csáji, B. C. (2006). Stochastic dynamic production control by neurodynamic programming. CIRP Annals - Manufacturing Technology, 55(1), 473–478. https://doi.org/10.1016/S0007-8506(07)60462-4

  • Monostori, L., Csáji, B. C., & Kádár, B. (2004). Adaptation and learning in distributed production control. CIRP Annals - Manufacturing Technology, 53(1), 349–352. https://doi.org/10.1016/S0007-8506(07)60714-8

  • Nahmias, S., & Olsen, T. L. (2015). Production and operations analysis. Long Grove: Waveland Press.

  • Neto, T. R. F., & Godinho Filho, M. (2013). Literature review regarding Ant Colony Optimization applied to scheduling problems: Guidelines for implementation and directions for future research. Engineering Applications of Artificial Intelligence, 26(1), 150–161. https://doi.org/10.1016/j.engappai.2012.03.011

  • Palombarini, J., & Martínez, E. (2010). Learning to repair plans and schedules using a relational (deictic) representation. In Computer aided chemical engineering (Vol. 27, pp. 1377–1382). Elsevier. https://doi.org/10.1016/s1570-7946(09)70620-0

  • Palombarini, J., & Martínez, E. (2012a). SmartGantt – An interactive system for generating and updating rescheduling knowledge using relational abstractions. Computers and Chemical Engineering, 47, 202–216. https://doi.org/10.1016/j.compchemeng.2012.06.021

  • Palombarini, J., & Martínez, E. (2012b). SmartGantt – An intelligent system for real time rescheduling based on relational reinforcement learning. Expert Systems With Applications, 39(11), 10251–10268. https://doi.org/10.1016/j.eswa.2012.02.176

  • Parente, M., Figueira, G., Amorim, P., & Marques, A. (2020). Production scheduling in the context of Industry 4.0: review and trends. International Journal of Production Research, 58(17), 5401–5431. https://doi.org/10.1080/00207543.2020.1718794

  • Park, I., Huh, J., Kim, J., & Park, J. (2020). A reinforcement learning approach to robust scheduling of semiconductor manufacturing facilities. IEEE Transactions on Automation Science and Engineering, 17(3), 1420–1431. https://doi.org/10.1109/tase.2019.2956762

  • Paternina-Arboleda, C. D., & Das, T. K. (2001). Intelligent dynamic control policies for serial production lines. IIE Transactions, 33(1), 65–77. https://doi.org/10.1023/A:1007641824604

  • Qu, S., Chu, T., Wang, J., Leckie, J., & Jian, W. (2015). A centralized reinforcement learning approach for proactive scheduling in manufacturing. In IEEE international conference on emerging technologies and factory automation, ETFA (pp. 1–8). https://doi.org/10.1109/ETFA.2015.7301417

  • Qu, S., Wang, J., Govil, S., & Leckie, J. O. (2016a). Optimized adaptive scheduling of a manufacturing process system with multi-skill workforce and multiple machine types: An ontology-based, multi-agent reinforcement learning approach. Procedia CIRP, 57, 55–60. https://doi.org/10.1016/j.procir.2016.11.011

  • Qu, S., Jie, W., & Shivani, G. (2016b). Learning adaptive dispatching rules for a manufacturing process system by using reinforcement learning approach. In IEEE International Conference on Emerging Technologies and Factory Automation, ETFA (pp. 1–8). https://doi.org/10.1109/etfa.2016.7733712

  • Qu, G., Wierman, A., & Li, N. (2020). Scalable reinforcement learning of localized policies for multi-agent networked systems. In Learning for Dynamics and Control (pp. 256–266).

  • Ramírez-Hernández, J. A., & Fernandez, E. (2005). A case study in scheduling reentrant manufacturing lines: Optimal and simulation-based approaches. In Proceedings of the 44th IEEE conference on decision and control (Vol. 2005, pp. 2158–2163). https://doi.org/10.1109/CDC.2005.1582481

  • Ramírez-Hernández, J. A., & Fernandez, E. (2009). A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model. In Proceedings - Winter simulation conference (pp. 1634–1645). https://doi.org/10.1109/wsc.2009.5429179

  • Ren, J., Ye, C., & Yang, F. (2020). A novel solution to JSPs based on long short-term memory and policy gradient algorithm. International Journal of Simulation Modelling, 19, 157–168. https://doi.org/10.2507/ijsimm19-1-co4

  • Reyna, Y. C. F., Cáceres, A. P., Jiménez, Y. M., & Reyes, Y. T. (2019a). An improvement of reinforcement learning approach for permutation of flow-shop scheduling problems. RISTI - Revista Iberica de Sistemas e Tecnologias de Informacao, (E18), 257–270.

  • Reyna, Y. C. F., Jiménez, Y. M., Cabrera, A. V., & Sánchez, E. A. (2019b). Optimization of heavily constrained hybrid-flexible flowshop problems using a multi-agent reinforcement learning approach. Investigación Operacional, 40(1), 100–111.

  • Reyna, Y. C. F., Jiménez, Y. M., & Nowé, A. (2018). Q-learning algorithm performance for m-machine n-jobs flow shop scheduling to minimize makespan. Investigación Operacional, 38(3), 281–290.

  • Reyna, Y. C. F., Jiménez, Y. M., Bermúdez Cabrera, J. M., & Méndez Hernández, B. M. (2015). A reinforcement learning approach for scheduling problems. Investigación Operacional, 36(3), 225–231.

  • Riedmiller, S., & Riedmiller, M. (1999). A neural reinforcement learning approach to learn local dispatching policies in production scheduling. In IJCAI international joint conference on artificial intelligence (Vol. 2, pp. 764–769).

  • Russell, S., & Norvig, P. (2010). Artificial intelligence: A modern approach. London: Pearson.

  • Schwartz, A. (1993). A reinforcement learning method for maximizing undiscounted rewards. In Proceedings of the tenth international conference on machine learning (pp. 298–305). https://doi.org/10.1016/b978-1-55860-307-3.50045-9

  • Shiue, Y., Lee, K., & Su, C. (2018). Real-time scheduling for a smart factory using a reinforcement learning approach. Computers & Industrial Engineering, 125, 604–614. https://doi.org/10.1016/j.cie.2018.03.039

  • Sigaud, O., & Buffet, O. (2013). Markov decision processes in artificial intelligence: MDPs, beyond MDPs and applications. New York: Wiley.

  • Stricker, N., Kuhnle, A., Sturm, R., & Friess, S. (2018). Reinforcement learning for adaptive order dispatching in the semiconductor industry. CIRP Annals, 67(1), 511–514. https://doi.org/10.1016/j.cirp.2018.04.041

  • Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge: MIT Press.

  • Szepesvári, C. (2010). Algorithms for reinforcement learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 4(1), 1–103. https://doi.org/10.2200/S00268ED1V01Y201005AIM009

  • Thomas, T. E., Koo, J., Chaterji, S., & Bagchi, S. (2018). Minerva: A reinforcement learning-based technique for optimal scheduling and bottleneck detection in distributed factory operations. In 2018 10th international conference on communication systems & networks (COMSNETS) (pp. 129–136). https://doi.org/10.1109/COMSNETS.2018.8328189

  • Van Otterlo, M. (2009). The logic of adaptive behavior: Knowledge representation and algorithms for adaptive sequential decision making under uncertainty in first-order and relational domains. IOS Press.

  • Vapnik, V. N. (2000). Methods of pattern recognition. In The nature of statistical learning theory (pp. 123–180). New York: Springer.

  • Wang, H. X., & Yan, H. S. (2013a). An adaptive scheduling system in knowledgeable manufacturing based on multi-agent. In 10th IEEE international conference on control and automation (ICCA) (pp. 496–501). https://doi.org/10.1109/icca.2013.6564866

  • Wang, H. X., & Yan, H. S. (2013b). An adaptive assembly scheduling approach in knowledgeable manufacturing. Applied Mechanics and Materials, 433–435, 2347–2350. https://doi.org/10.4028/www.scientific.net/AMM.433-435.2347

  • Wang, H. X., & Yan, H. S. (2016). An interoperable adaptive scheduling strategy for knowledgeable manufacturing based on SMGWQ-learning. Journal of Intelligent Manufacturing, 27(5), 1085–1095. https://doi.org/10.1007/s10845-014-0936-1

  • Wang, H. X., Sarker, B. R., Li, J., & Li, J. (2020). Adaptive scheduling for assembly job shop with uncertain assembly times based on dual Q-learning. International Journal of Production Research. https://doi.org/10.1080/00207543.2020.1794075

  • Wang, Y. C., & Usher, J. M. (2004). Learning policies for single machine job dispatching. Robotics and Computer-Integrated Manufacturing, 20(6), 553–562. https://doi.org/10.1016/j.rcim.2004.07.003

  • Wang, Y. C., & Usher, J. M. (2005). Application of reinforcement learning for agent-based production scheduling. Engineering Applications of Artificial Intelligence, 18(1), 73–82. https://doi.org/10.1016/j.engappai.2004.08.018

  • Wang, Y. C., & Usher, J. M. (2007). A reinforcement learning approach for developing routing policies in multi-agent production scheduling. International Journal of Advanced Manufacturing Technology, 33(3–4), 323–333. https://doi.org/10.1007/s00170-006-0465-y

  • Wang, Y. F. (2018). Adaptive job shop scheduling strategy based on weighted Q-learning algorithm. Journal of Intelligent Manufacturing, 31(2), 417–432. https://doi.org/10.1007/s10845-018-1454-3

  • Waschneck, B., Reichstaller, A., Belzner, L., Altenmüller, T., Bauernhansl, T., Knapp, A., & Kyek, A. (2018a). Optimization of global production scheduling with deep reinforcement learning. Procedia CIRP, 72, 1264–1269. https://doi.org/10.1016/j.procir.2018.03.212

  • Waschneck, B., Reichstaller, A., Belzner, L., Altenmüller, T., Bauernhansl, T., Knapp, A., & Kyek, A. (2018b). Deep reinforcement learning for semiconductor production scheduling. In 2018 29th annual SEMI advanced semiconductor manufacturing conference, ASMC 2018 (pp. 301–306). https://doi.org/10.1109/asmc.2018.8373191

  • Wei, Y., & Zhao, M. (2004). Composite rules selection using reinforcement learning for dynamic job-shop scheduling. In 2004 IEEE conference on robotics, automation and mechatronics (Vol. 2, pp. 1083–1088). https://doi.org/10.1109/RAMECH.2004.1438070

  • Xanthopoulos, A. S., Koulouriotis, D. E., Tourassis, V. D., & Emiris, D. M. (2013). Intelligent controllers for bi-objective dynamic scheduling on a single machine with sequence-dependent setups. Applied Soft Computing Journal, 13(12), 4704–4717. https://doi.org/10.1016/j.asoc.2013.07.015

  • Xiao, Y., Tan, Q., Zhou, L., & Tang, H. (2017). Stochastic scheduling with compatible job families by an improved Q-learning algorithm. In Chinese Control Conference, CCC (pp. 2657–2662). https://doi.org/10.23919/ChiCC.2017.8027764

  • Yang, H. B., & Yan, H. S. (2009). An adaptive approach to dynamic scheduling in knowledgeable manufacturing cell. International Journal of Advanced Manufacturing Technology, 42(3–4), 312–320. https://doi.org/10.1007/s00170-008-1588-0

  • Yang, H. B., & Yan, H. S. (2007). An adaptive policy of dynamic scheduling in knowledgeable manufacturing environment. In Proceedings of the IEEE international conference on automation and logistics, ICAL 2007 (pp. 835–840). https://doi.org/10.1109/ICAL.2007.4338680

  • Yingzi, W., Xinli, J., & Pingbo, H. (2009). Pattern driven dynamic scheduling approach using reinforcement learning. In 2009 IEEE international conference on automation and logistics (pp. 514–519). https://doi.org/10.1109/ICAL.2009.5262867

  • Yuan, B., Jiang, Z., & Wang, L. (2016). Dynamic parallel machine scheduling with random breakdowns using the learning agent. International Journal of Services Operations and Informatics, 8(2), 94–103. https://doi.org/10.1504/IJSOI.2016.080083

  • Yuan, B., Wang, L., & Jiang, Z. (2013). Dynamic parallel machine scheduling using the learning agent. In 2013 IEEE international conference on industrial engineering and engineering management (pp. 1565–1569). https://doi.org/10.1109/IEEM.2013.6962673

  • Zhang, T., Xie, S., & Rose, O. (2017). Real-time job shop scheduling based on simulation and Markov decision processes. In Proceedings - Winter simulation conference (pp. 3899–3907). https://doi.org/10.1109/WSC.2017.8248100

  • Zhang, T., Xie, S., & Rose, O. (2018). Real-time batching in job shops based on simulation and reinforcement learning. In 2018 Winter simulation conference (WSC) (pp. 3331–3339). https://doi.org/10.1109/WSC.2018.8632524

  • Zhang, W., & Dietterich, T. G. (1995). A reinforcement learning approach to job-shop scheduling. In 1995 International joint conference on artificial intelligence (pp. 1114–1120).

  • Zhang, W., & Dietterich, T. G. (1996). High-performance job-shop scheduling with a time-delay TD(λ) network. Advances in Neural Information Processing Systems, 8, 1024–1030.

  • Zhang, Z., Zheng, L., Hou, F., & Li, N. (2011). Semiconductor final test scheduling with Sarsa(λ, k) algorithm. European Journal of Operational Research, 215(2), 446–458. https://doi.org/10.1016/j.ejor.2011.05.052

  • Zhang, Z., Zheng, L., Li, N., Wang, W., Zhong, S., & Hu, K. (2012). Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning. Computers and Operations Research, 39(7), 1315–1324. https://doi.org/10.1016/j.cor.2011.07.019

  • Zhang, Z., Zheng, L., & Weng, M. X. (2007). Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-learning. International Journal of Advanced Manufacturing Technology, 34(9–10), 968–980. https://doi.org/10.1007/s00170-006-0662-8

  • Zhao, M., Li, X., Gao, L., Wang, L., & Xiao, M. (2019). An improved Q-learning based rescheduling method for flexible job-shops with machine failures. In 2019 IEEE 15th international conference on automation science and engineering (CASE) (pp. 331–337). https://doi.org/10.1109/COASE.2019.8843100

  • Zhou, L., Zhang, L., & Horn, B. K. P. (2020). Deep reinforcement learning-based dynamic scheduling in smart manufacturing. Procedia CIRP, 93, 383–388. https://doi.org/10.1016/j.procir.2020.05.163

Author information

Correspondence to Behice Meltem Kayhan.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kayhan, B.M., Yildiz, G. Reinforcement learning applications to machine scheduling problems: a comprehensive literature review. J Intell Manuf 34, 905–929 (2023). https://doi.org/10.1007/s10845-021-01847-3
