Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-cooperative Learning

van Heeswijk, Wouter

doi:10.1007/978-3-030-59747-4_4

Wouter van Heeswijk

Conference paper
First Online: 22 September 2020

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12433)

Abstract

Smart modular freight containers, as advocated in the Physical Internet paradigm, are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers may autonomously place bids on transport services in a spot market. However, individual containers may find it difficult to learn good bidding policies because of their short lifespan. By sharing information and costs with one another, smart containers can jointly learn bidding policies, even while competing for the same transport capacity. We replicate this behavior by learning stochastic bidding policies in a semi-cooperative multi-agent setting. To this end, we develop a reinforcement learning algorithm based on the policy gradient framework. Numerical experiments show that sharing only bids and acceptance decisions leads to stable bidding policies. Real-time system information improves performance only marginally; individual job properties suffice to place appropriate bids. Furthermore, we find that carriers may have incentives not to share information with the smart containers. The experiments give rise to several directions for follow-up research, particularly on the interaction between smart containers and transport services in self-organizing logistics.
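The idea of learning a stochastic bidding policy with a policy gradient can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes a Gaussian policy whose mean is linear in the job features, a fixed standard deviation, a running-average baseline, and a toy carrier that accepts any bid covering a fixed cost per unit volume. All function names, the feature vector, the penalty for a rejected bid and the parameter values are hypothetical. The weight vector `theta` is shared, so every container's bid and acceptance outcome updates the same policy.

```python
import random


def features(volume):
    """Hypothetical job features: a bias term and the job volume."""
    return [1.0, volume]


def mean_bid(theta, phi):
    """Mean of the Gaussian bidding policy, linear in the features."""
    return sum(t * p for t, p in zip(theta, phi))


def grad_log_gaussian(bid, mu, sigma, phi):
    """Gradient of log N(bid; mu, sigma^2) w.r.t. theta, where mu = theta . phi."""
    scale = (bid - mu) / sigma ** 2
    return [scale * p for p in phi]


def carrier_accepts(bid, volume, unit_cost=2.0):
    """Toy carrier rule (assumption): accept bids that cover a cost per unit volume."""
    return bid >= unit_cost * volume


def train(episodes=5000, alpha=0.002, sigma=1.0, penalty=15.0, seed=0):
    """REINFORCE-style updates of one weight vector shared by all containers."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    baseline = 0.0  # running average of observed rewards
    for n in range(1, episodes + 1):
        volume = rng.uniform(1.0, 3.0)      # a new container with a random job
        phi = features(volume)
        mu = mean_bid(theta, phi)
        bid = rng.gauss(mu, sigma)          # sample a bid from the stochastic policy
        # Reward: pay the bid if accepted, otherwise incur a fixed penalty.
        reward = -bid if carrier_accepts(bid, volume) else -penalty
        advantage = reward - baseline
        grad = grad_log_gaussian(bid, mu, sigma, phi)
        theta = [t + alpha * advantage * g for t, g in zip(theta, grad)]
        baseline += (reward - baseline) / n
    return theta


if __name__ == "__main__":
    print(train())
```

Calling `train()` returns the learned weights of the shared policy. The sketch makes no claim about the values reached; it only shows how sampled bids and acceptance decisions drive the gradient update.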

Notes

1. https://github.com/woutervanheeswijk/policygradientsmartcontainers

Author information

Authors and Affiliations

Department of Industrial Engineering and Business Information Systems, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Wouter van Heeswijk

Corresponding author

Correspondence to Wouter van Heeswijk.

Editor information

Editors and Affiliations

University of Twente, Enschede, The Netherlands
Eduardo Lalla-Ruiz
University of Twente, Enschede, The Netherlands
Martijn Mes
University of Hamburg, Hamburg, Germany
Stefan Voß

Cite this paper

van Heeswijk, W. (2020). Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-cooperative Learning. In: Lalla-Ruiz, E., Mes, M., Voß, S. (eds) Computational Logistics. ICCL 2020. Lecture Notes in Computer Science, vol 12433. Springer, Cham. https://doi.org/10.1007/978-3-030-59747-4_4

DOI: https://doi.org/10.1007/978-3-030-59747-4_4
Published: 22 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59746-7
Online ISBN: 978-3-030-59747-4
eBook Packages: Computer Science, Computer Science (R0)
