Abstract
Smart modular freight containers, as promoted in the Physical Internet paradigm, are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers may autonomously place bids on transport services in a spot market. However, individual containers may struggle to learn good bidding policies because of their short lifespan. By sharing information and costs with one another, smart containers can jointly learn bidding policies, even while competing for the same transport capacity. We replicate this behavior by learning stochastic bidding policies in a semi-cooperative multi-agent setting. To this end, we develop a reinforcement learning algorithm based on the policy gradient framework. Numerical experiments show that sharing only bids and acceptance decisions leads to stable bidding policies. Real-time system information improves performance only marginally; individual job properties suffice to place appropriate bids. Furthermore, we find that carriers may have incentives not to share information with the smart containers. The experiments give rise to several directions for follow-up research, particularly on the interaction between smart containers and transport services in self-organizing logistics.
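To illustrate what learning a stochastic bidding policy with a policy gradient method can look like, the following is a minimal REINFORCE-style sketch. It is not the algorithm of the paper. Everything in it is an assumption made for illustration: the Gaussian policy with a mean that is linear in a single job feature (`distance`), the fixed standard deviation `sigma`, the hidden carrier acceptance threshold, the rejection `penalty`, and all parameter values.

```python
import math
import random


def features(distance):
    # Hypothetical job features: a bias term and the transport distance.
    return [1.0, distance]


def mean_bid(theta, distance):
    # Mean of the Gaussian bidding policy, linear in the job features.
    return sum(t * f for t, f in zip(theta, features(distance)))


def score(theta, sigma, distance, bid):
    # Gradient of log N(bid; mu, sigma^2) with respect to theta,
    # where mu = theta . features(distance).
    mu = mean_bid(theta, distance)
    return [(bid - mu) / sigma ** 2 * f for f in features(distance)]


def train(episodes=2000, alpha=0.005, sigma=0.5, penalty=5.0, seed=0):
    rng = random.Random(seed)
    theta = [1.0, 1.0]   # shared policy parameters (assumed initial values)
    baseline = 0.0       # running average of rewards, used to reduce variance
    for _ in range(episodes):
        distance = rng.uniform(0.5, 2.0)
        # The container samples a bid from its stochastic policy.
        bid = rng.gauss(mean_bid(theta, distance), sigma)
        # Hypothetical carrier rule: accept bids at or above a noisy cost.
        threshold = distance + rng.gauss(0.0, 0.1)
        accepted = bid >= threshold
        # The container pays its bid if accepted, or a penalty if rejected.
        reward = -bid if accepted else -penalty
        # REINFORCE update: step along (reward - baseline) * grad log pi.
        grad = score(theta, sigma, distance, bid)
        theta = [t + alpha * (reward - baseline) * g
                 for t, g in zip(theta, grad)]
        baseline += 0.05 * (reward - baseline)
    return theta


if __name__ == "__main__":
    learned = train()
    print("learned parameters:", learned)
    print("mean bid for distance 1.0:", mean_bid(learned, 1.0))
```

In this sketch only the bid and the acceptance decision enter the update, which mirrors the kind of information the abstract says is shared between containers; the carrier's threshold itself is never observed by the learner.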
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
van Heeswijk, W. (2020). Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-cooperative Learning. In: Lalla-Ruiz, E., Mes, M., Voß, S. (eds) Computational Logistics. ICCL 2020. Lecture Notes in Computer Science, vol 12433. Springer, Cham. https://doi.org/10.1007/978-3-030-59747-4_4
DOI: https://doi.org/10.1007/978-3-030-59747-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59746-7
Online ISBN: 978-3-030-59747-4
eBook Packages: Computer Science, Computer Science (R0)