Abstract
In recent years, consumer credit has flourished in China. The credit factory is an important operating model for speeding up the loan application process. Order scheduling in credit factories is an NP-hard problem, and it has a significant effect on credit factory efficiency. In this work, we formulate order scheduling in credit factories as a multi-agent reinforcement learning (MARL) task. For the proposed MARL algorithm, we explore a new reward mechanism, covering both reward calculation and reward assignment, that is suited to this task. Moreover, we use a convolutional auto-encoder to generate the multi-agent state. To avoid physical costs during MARL training, we build a simulator, named Virtual Credit Factory, to pre-train the MARL algorithm. Through experiments in Virtual Credit Factory and an A/B test in a real application, we compare the performance of the proposed MARL approach with that of several classic heuristic approaches. In both cases, the results show that the MARL approach performs better and is strongly robust.
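As a rough illustration of the kind of classic heuristic baseline mentioned above, the following minimal sketch compares two well-known dispatching rules, first-in-first-out (FIFO) and shortest-processing-time-first (SPT), on a toy queue. The abstract does not name the heuristics actually used, so the choice of rules, the single-worker setting, the assumption that all orders are available at time zero, and the order fields are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical order record: (order id, processing time in minutes).
Order = Tuple[str, float]


def mean_completion_time(sequence: List[Order]) -> float:
    """Mean completion time when one worker handles orders in the given sequence.

    Assumes every order is already waiting at time zero.
    """
    clock = 0.0
    total = 0.0
    for _, duration in sequence:
        clock += duration   # this order finishes once its processing time has elapsed
        total += clock
    return total / len(sequence)


def fifo(orders: List[Order]) -> List[Order]:
    """First-in-first-out: keep the arrival order."""
    return list(orders)


def spt(orders: List[Order]) -> List[Order]:
    """Shortest processing time first."""
    return sorted(orders, key=lambda order: order[1])


# Toy data, invented for illustration only.
orders = [("A", 30.0), ("B", 5.0), ("C", 10.0)]
fifo_mean = mean_completion_time(fifo(orders))
spt_mean = mean_completion_time(spt(orders))
print(f"FIFO mean completion time: {fifo_mean:.2f} min")
print(f"SPT mean completion time:  {spt_mean:.2f} min")
```

Rules like these are fixed in advance, whereas a learned policy can adapt its dispatching decisions to the current state of the queue, which is the motivation for comparing the two families.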
Copyright information
© 2021 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Huang, C., Cui, R., Deng, J., Jia, N. (2021). Real-Time Order Scheduling in Credit Factories: A Multi-agent Reinforcement Learning Approach. In: Gao, W., et al. Intelligent Computing and Block Chain. FICC 2020. Communications in Computer and Information Science, vol 1385. Springer, Singapore. https://doi.org/10.1007/978-981-16-1160-5_36
DOI: https://doi.org/10.1007/978-981-16-1160-5_36
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1159-9
Online ISBN: 978-981-16-1160-5
eBook Packages: Computer Science (R0)