Deep Reinforcement Learning and Optimization Approach for Multi-echelon Supply Chain with Uncertain Demands

Alves, Júlio César; Mateus, Geraldo Robson

doi:10.1007/978-3-030-59747-4_38

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12433)

Included in the following conference series:

International Conference on Computational Logistics

Abstract

Deep Reinforcement Learning (Deep RL) has recently been applied successfully in many areas. A multi-period supply chain operation can be viewed as a sequential decision-making problem, for which Deep RL may be appropriate. Previous applications of this approach to related problems consider only serial or two-echelon supply chains with limited decision possibilities. This research considers a four-echelon supply chain with two nodes per echelon and stochastic customer demands. An MDP formulation and a Non-Linear Programming model of the problem are presented. Proximal Policy Optimization (PPO2) is used to find a good policy to operate the entire supply chain and minimize total operating costs. An agent based on a linearized model is used as a baseline. Experimental results indicate that PPO2 is a suitable and competitive approach for the proposed problem.
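
The sequential decision-making view mentioned in the abstract can be illustrated with a deliberately small sketch. It is not the four-echelon model or the MDP formulation of the paper: it assumes a single hypothetical node, and the names `ToyNode`, `run_episode`, the cost values, the demand range, and the base-stock policy are all illustrative assumptions. It only shows the loop of observing a state, choosing an order quantity, facing a random demand, and receiving a negative cost as reward.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical single-node stand-in for one stage of a supply chain."""
    stock: int = 10
    holding_cost: float = 1.0  # assumed cost per unit left in stock
    unmet_cost: float = 5.0    # assumed penalty per unit of unmet demand

    def step(self, order: int, demand: int) -> float:
        """Apply one period: receive `order`, serve `demand`, return the reward."""
        available = self.stock + order
        served = min(available, demand)
        unmet = demand - served
        self.stock = available - served
        cost = self.holding_cost * self.stock + self.unmet_cost * unmet
        return -cost  # maximizing reward is the same as minimizing cost


def run_episode(policy, periods: int = 5, seed: int = 0) -> float:
    """Run `policy` (stock -> order quantity) against random demands."""
    rng = random.Random(seed)
    node = ToyNode()
    total_reward = 0.0
    for _ in range(periods):
        demand = rng.randint(0, 20)  # assumed demand distribution
        total_reward += node.step(policy(node.stock), demand)
    return total_reward


if __name__ == "__main__":
    # Hypothetical base-stock policy: order up to 15 units each period.
    print(run_episode(lambda stock: max(0, 15 - stock)))
```

A learning agent such as PPO2 would replace the fixed policy passed to `run_episode` with one trained to maximize the accumulated reward.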

Notes

1. The unmet demand cost \(c^d\) was set to \(3c^q\), where \(c^q\) is the total operating cost of delivering one unit of product. The value of \(c^q\) was calculated as 72 for the presented scenario (considering the highest supply and processing costs, total transportation costs, and inventory costs over eight periods).
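
The arithmetic in this note can be checked with a short sketch. The values 72 and 3 come from the note itself; the constant and function names are illustrative assumptions, and the linear penalty per unmet unit is assumed for the example rather than taken from the paper's model.

```python
UNIT_OPERATING_COST = 72     # c^q, as reported in the note
UNMET_DEMAND_MULTIPLIER = 3  # c^d = 3 * c^q, as stated in the note


def unmet_demand_cost(unit_operating_cost: int = UNIT_OPERATING_COST,
                      multiplier: int = UNMET_DEMAND_MULTIPLIER) -> int:
    """Return c^d, the cost charged per unit of unmet demand."""
    return multiplier * unit_operating_cost


def unmet_penalty(unmet_units: int) -> int:
    """Total penalty for `unmet_units`, assuming a linear per-unit charge."""
    return unmet_units * unmet_demand_cost()


if __name__ == "__main__":
    print(unmet_demand_cost())  # 216
    print(unmet_penalty(5))     # 1080
```

Under these values, each unit of unmet demand costs 216, three times the cost of delivering it.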

Author information

Authors and Affiliations

Universidade Federal de Lavras, Lavras, MG, Brazil
Júlio César Alves
Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
Júlio César Alves & Geraldo Robson Mateus

Corresponding author

Correspondence to Júlio César Alves.

Editor information

Editors and Affiliations

University of Twente, Enschede, The Netherlands
Eduardo Lalla-Ruiz
University of Twente, Enschede, The Netherlands
Martijn Mes
University of Hamburg, Hamburg, Germany
Stefan Voß

About this paper

Cite this paper

Alves, J.C., Mateus, G.R. (2020). Deep Reinforcement Learning and Optimization Approach for Multi-echelon Supply Chain with Uncertain Demands. In: Lalla-Ruiz, E., Mes, M., Voß, S. (eds) Computational Logistics. ICCL 2020. Lecture Notes in Computer Science, vol 12433. Springer, Cham. https://doi.org/10.1007/978-3-030-59747-4_38

DOI: https://doi.org/10.1007/978-3-030-59747-4_38
Published: 22 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59746-7
Online ISBN: 978-3-030-59747-4
eBook Packages: Computer Science, Computer Science (R0)
