Effective Cross-Region Courier-Displacement for Instant Delivery via Reinforcement Learning

  • Conference paper
Wireless Algorithms, Systems, and Applications (WASA 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12937)

Abstract

With the rapid development of mobile phones and the Internet of Things, instant delivery services (e.g., UberEats and MeiTuan) have become a popular way for people to order food, fruit, and other groceries online, especially since the onset of COVID-19. In instant delivery services, it is important to dispatch massive numbers of orders to a limited number of couriers, especially during rush hours. To meet this need, an efficient courier displacement mechanism can not only balance demand (orders to be picked up) and supply (couriers’ capacity) but also improve the efficiency of order delivery by reducing idle displacement time. Existing studies on fleet management for ride-sharing or bike rebalancing cannot be applied to the courier displacement problem in instant delivery because of its unique practical factors, including regional differences and strict delivery time constraints. In this work, we propose an efficient cross-region courier displacement method, Courier Displacement Reinforcement Learning (CDRL), based on multi-agent actor-critic, which considers dynamic demand and supply at the region level as well as strict time constraints. Specifically, the multi-agent actor-critic reinforcement learning-based courier displacement framework uses a policy network to generate displacement decisions that take multiple practical factors into account, and a value network to evaluate the decisions of the policy network. We evaluate the method on one month of real-world order records from Shanghai collected from Eleme (one of the biggest instant delivery services in China); the results show that our method offers up to a 36% increase in courier displacement performance and reduces idle ride time by 17%.
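The actor-critic idea named in the abstract (a policy that proposes displacement decisions and a value estimate that scores them) can be illustrated with a very small sketch. This is not the authors' CDRL implementation: it replaces the neural policy and value networks with lookup tables, uses a single shared learner instead of multiple agents, and treats each decision as a one-step problem. The three regions, the `DEMAND` figures, the `TRAVEL_COST` penalty, and the reward formula are all hypothetical values chosen only for illustration.

```python
import math
import random

# Hypothetical toy setup (not from the paper): 3 regions on a line,
# each with an assumed fixed number of waiting orders per step.
DEMAND = [1.0, 3.0, 6.0]
N_REGIONS = len(DEMAND)
TRAVEL_COST = 1.0  # assumed idle-ride penalty per region crossed


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=20000, alpha_actor=0.1, alpha_critic=0.1, seed=0):
    """One-step tabular actor-critic over (current region -> target region)."""
    rng = random.Random(seed)
    # Actor: preference for sending a courier from region r to region a.
    theta = [[0.0] * N_REGIONS for _ in range(N_REGIONS)]
    # Critic: estimated value of a courier sitting in each region.
    value = [0.0] * N_REGIONS
    for _ in range(episodes):
        region = rng.randrange(N_REGIONS)
        probs = softmax(theta[region])
        target = rng.choices(range(N_REGIONS), weights=probs)[0]
        # Assumed reward: demand at the target minus idle travel cost.
        reward = DEMAND[target] - TRAVEL_COST * abs(target - region)
        # The critic's estimate serves as a baseline for the actor.
        advantage = reward - value[region]
        value[region] += alpha_critic * advantage
        # Policy-gradient step for a softmax policy.
        for a in range(N_REGIONS):
            indicator = 1.0 if a == target else 0.0
            theta[region][a] += alpha_actor * advantage * (indicator - probs[a])
    return theta, value


def greedy_policy(theta):
    """Most-preferred target region for each starting region."""
    return [max(range(N_REGIONS), key=lambda a: theta[r][a])
            for r in range(N_REGIONS)]


if __name__ == "__main__":
    theta, value = train()
    print("greedy targets per region:", greedy_policy(theta))
    print("critic values per region:", [round(v, 2) for v in value])
```

Under these assumed numbers, the learned policy should come to favor the highest-demand region whenever the extra demand outweighs the travel penalty. A realistic version would also need time-varying demand, delivery deadlines, and interactions between couriers, none of which this sketch models.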

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61902066, the Natural Science Foundation of Jiangsu Province under Grant No. BK20190336, the China National Key R&D Program under Grant No. 2018YFB2100302, and the Fundamental Research Funds for the Central Universities under Grant No. 2242021R41068.

Author information

Corresponding author

Correspondence to Xiaolei Zhou.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Hu, S., Guo, B., Wang, S., Zhou, X. (2021). Effective Cross-Region Courier-Displacement for Instant Delivery via Reinforcement Learning. In: Liu, Z., Wu, F., Das, S.K. (eds) Wireless Algorithms, Systems, and Applications. WASA 2021. Lecture Notes in Computer Science, vol 12937. Springer, Cham. https://doi.org/10.1007/978-3-030-85928-2_23

  • DOI: https://doi.org/10.1007/978-3-030-85928-2_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85927-5

  • Online ISBN: 978-3-030-85928-2

  • eBook Packages: Computer Science, Computer Science (R0)
