Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-cooperative Learning

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12433)

Abstract

Smart modular freight containers, as propagated in the Physical Internet paradigm, are equipped with sensors, data storage capability and intelligence that enable them to route themselves from origin to destination without manual intervention or central governance. In this self-organizing setting, containers may autonomously place bids on transport services in a spot market. However, individual containers have a short lifespan, which makes it difficult for each of them to learn a good bidding policy on its own. By sharing information and costs with one another, smart containers can jointly learn bidding policies, even though they simultaneously compete for the same transport capacity. We replicate this behavior by learning stochastic bidding policies in a semi-cooperative multi-agent setting. To this end, we develop a reinforcement learning algorithm based on the policy gradient framework. Numerical experiments show that sharing only bids and acceptance decisions leads to stable bidding policies. Real-time system information improves performance only marginally; individual job properties suffice to place appropriate bids. Furthermore, we find that carriers may have incentives not to share information with the smart containers. The experiments give rise to several directions for follow-up research, particularly on the interaction between smart containers and transport services in self-organizing logistics.
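To make the idea of a stochastic bidding policy learned by policy gradient more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a Gaussian policy whose mean bid is linear in a few job features and applies a REINFORCE-style update with a running baseline. The feature set, the carrier's hidden acceptance price, the reward definition, and every function and parameter name (`features`, `carrier_price`, `max_value`, `alpha`, `sigma`) are hypothetical choices made only for illustration.

```python
import random


def features(distance, volume):
    # Hypothetical job features: a bias term plus distance and volume in [0, 1].
    return [1.0, distance, volume]


def mean_bid(theta, phi):
    # Mean of the Gaussian bidding policy: linear in the job features.
    return sum(t * p for t, p in zip(theta, phi))


def grad_log_policy(theta, phi, sigma, bid):
    # Gradient of log N(bid | theta . phi, sigma^2) with respect to theta:
    # (bid - mu) / sigma^2 * phi
    mu = mean_bid(theta, phi)
    return [(bid - mu) / sigma ** 2 * p for p in phi]


def reward(bid, carrier_price, max_value):
    # Assumed reward: the container keeps (max_value - bid) if the carrier
    # accepts the bid, and gets nothing if the bid is rejected.
    return max_value - bid if bid >= carrier_price else 0.0


def train(episodes=20000, alpha=0.01, sigma=0.5, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(episodes):
        distance, volume = rng.random(), rng.random()
        phi = features(distance, volume)
        # Assumed acceptance threshold, unknown to the bidding container.
        carrier_price = 1.0 + 2.0 * distance + 1.0 * volume
        # Sample a bid from the stochastic policy.
        bid = rng.gauss(mean_bid(theta, phi), sigma)
        r = reward(bid, carrier_price, max_value=6.0)
        # REINFORCE step: move theta along (r - baseline) * grad log pi.
        grad = grad_log_policy(theta, phi, sigma, bid)
        theta = [t + alpha * (r - baseline) * g for t, g in zip(theta, grad)]
        baseline += 0.01 * (r - baseline)
    return theta


if __name__ == "__main__":
    print(train())
```

In this sketch each simulated container observes only its own job features, its bid, and whether the bid was accepted, while all containers update one shared parameter vector. This loosely mirrors the abstract's setting in which containers share bids and acceptance decisions to learn jointly.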

Notes

  1. https://github.com/woutervanheeswijk/policygradientsmartcontainers

Author information

Corresponding author

Correspondence to Wouter van Heeswijk.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

van Heeswijk, W. (2020). Smart Containers with Bidding Capacity: A Policy Gradient Algorithm for Semi-cooperative Learning. In: Lalla-Ruiz, E., Mes, M., Voß, S. (eds) Computational Logistics. ICCL 2020. Lecture Notes in Computer Science, vol 12433. Springer, Cham. https://doi.org/10.1007/978-3-030-59747-4_4

  • DOI: https://doi.org/10.1007/978-3-030-59747-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59746-7

  • Online ISBN: 978-3-030-59747-4

  • eBook Packages: Computer Science, Computer Science (R0)
