
A Reinforcement Learning Approach to Inventory Management

  • Conference paper
Advances in Artificial Intelligence and Data Engineering (AIDE 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1133)

Abstract

This paper presents our approach to controlling a centralized, distributed inventory management system using reinforcement learning (RL). We propose applying policy-based RL algorithms to this problem. We formulate the problem as a Markov decision process (MDP) and build an environment that tracks multiple products across multiple warehouses and returns a reward signal that corresponds directly to the total revenue across all warehouses at every time step. In this environment, we apply policy-based RL algorithms such as Advantage Actor-Critic, Trust Region Policy Optimization and Proximal Policy Optimization to decide how much of each product to stock in every warehouse. We evaluate how well these algorithms maximize average revenue over time under various statistical distributions from which demand is sampled at each time step of each training episode. We also compare these approaches with an existing approach based on a fixed replenishment scheme. Finally, we discuss the results of our evaluation and the scope for future work on the topic.
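The formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' environment: the class `InventoryEnv`, the per-product `capacity`, the `prices` list, the `demand_sampler` callable, and the threshold and target values of the baseline are all hypothetical names and parameters chosen for illustration. The state is the stock grid, the action is an order quantity per product per warehouse, and the reward is assumed here to be the sales revenue summed over all warehouses at each time step.

```python
import random
from typing import Callable, List, Tuple

Grid = List[List[int]]  # indexed as [warehouse][product]


class InventoryEnv:
    """Toy multi-warehouse, multi-product inventory MDP (illustrative only)."""

    def __init__(self, n_warehouses: int, n_products: int, capacity: int,
                 prices: List[float], demand_sampler: Callable[[], int],
                 horizon: int = 30):
        self.n_warehouses = n_warehouses
        self.n_products = n_products
        self.capacity = capacity              # assumed max units per product per warehouse
        self.prices = prices                  # assumed selling price per product
        self.demand_sampler = demand_sampler  # draws one demand value per call
        self.horizon = horizon                # time steps per episode
        self.reset()

    def reset(self) -> Grid:
        self.t = 0
        self.stock = [[0] * self.n_products for _ in range(self.n_warehouses)]
        return [row[:] for row in self.stock]

    def step(self, order: Grid) -> Tuple[Grid, float, bool]:
        reward = 0.0
        for w in range(self.n_warehouses):
            for p in range(self.n_products):
                # Clip the order so stock stays within [0, capacity].
                qty = max(0, min(order[w][p], self.capacity - self.stock[w][p]))
                self.stock[w][p] += qty
                # Demand is sampled independently per product and warehouse.
                sold = min(self.stock[w][p], self.demand_sampler())
                self.stock[w][p] -= sold
                reward += self.prices[p] * sold  # revenue from units sold
        self.t += 1
        return [row[:] for row in self.stock], reward, self.t >= self.horizon


def fixed_replenishment(stock: Grid, threshold: int, target: int) -> Grid:
    """Baseline: order up to `target` whenever stock is below `threshold`."""
    return [[target - s if s < threshold else 0 for s in row] for row in stock]


def run_episode(env: InventoryEnv, policy: Callable[[Grid], Grid]) -> float:
    """Roll out one episode and return the total reward."""
    state, total, done = env.reset(), 0.0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total
```

To try it, construct the environment with a sampler such as `lambda: random.randint(0, 10)` and pass `lambda s: fixed_replenishment(s, 3, 8)` to `run_episode`. A learned policy would replace that lambda with a function mapping the stock grid to order quantities, and swapping the sampler changes the demand distribution without touching the environment.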

Author information

Corresponding author

Correspondence to Apoorva Gokhale.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gokhale, A., Trasikar, C., Shah, A., Hegde, A., Naik, S.R. (2021). A Reinforcement Learning Approach to Inventory Management. In: Chiplunkar, N.N., Fukao, T. (eds) Advances in Artificial Intelligence and Data Engineering. AIDE 2019. Advances in Intelligent Systems and Computing, vol 1133. Springer, Singapore. https://doi.org/10.1007/978-981-15-3514-7_23
