Deep Reinforcement Learning for the Capacitated Pickup and Delivery Problem with Time Windows

Soroka, A. G.; Meshcheryakov, A. V.; Gerasimov, S. V.

doi:10.1134/S1054661823020165

SELECTED CONFERENCE PAPERS
Published: 03 July 2023

Volume 33, pages 169–178, (2023)

Pattern Recognition and Image Analysis

A. G. Soroka¹,
A. V. Meshcheryakov¹,² &
S. V. Gerasimov¹

Abstract

The vehicle routing problem with pickup and delivery is one of the most important problems in the context of global urban population growth. Although small instances of such problems can be solved with various classical approaches, a fast (or real-time) route optimizer that handles real-world constraints (such as throughput and time window constraints) for medium- and large-size problems remains a challenge. In this work, we successfully apply, for the first time, a deep reinforcement learning approach (a modified JAMPR model) to the capacitated pickup and delivery problem with time windows (CPDPTW). We obtain a robust model that quickly gives an optimal solution for small- to medium-size problems and a suboptimal solution for large-size (>200) problems.
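
To make the constraints named in the abstract concrete, the following is a minimal sketch of a feasibility check for a single CPDPTW route. It is not the authors' JAMPR-based model; it only illustrates the three constraint families (pickup before delivery, vehicle capacity, time windows). The names `Stop` and `route_is_feasible`, the Euclidean travel time, and the unit speed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Stop:
    """Hypothetical stop record; field names are illustrative assumptions."""
    x: float
    y: float
    demand: int      # > 0 at a pickup, < 0 at the paired delivery, 0 at the depot
    earliest: float  # time window opens
    latest: float    # time window closes
    pair: int = -1   # index of the matching pickup (set on deliveries only)


def route_is_feasible(route, stops, capacity, speed=1.0):
    """Return True if one vehicle can serve `route` (a list of stop indices)."""
    load, clock, visited = 0, 0.0, set()
    prev = stops[route[0]]
    for idx in route:
        stop = stops[idx]
        # Travel from the previous stop (assumed Euclidean), wait if early.
        clock += hypot(stop.x - prev.x, stop.y - prev.y) / speed
        clock = max(clock, stop.earliest)
        if clock > stop.latest:
            return False  # time window violated
        if stop.pair >= 0 and stop.pair not in visited:
            return False  # delivery attempted before its pickup
        load += stop.demand
        if load > capacity or load < 0:
            return False  # capacity violated (or delivering goods not on board)
        visited.add(idx)
        prev = stop
    return True


# Toy instance: depot, one pickup, and its delivery.
stops = [
    Stop(0, 0, 0, 0, 100),           # depot
    Stop(3, 4, 2, 0, 10),            # pickup of 2 units
    Stop(6, 8, -2, 0, 20, pair=1),   # delivery paired with stop 1
]
ok = route_is_feasible([0, 1, 2, 0], stops, capacity=2)
wrong_order = route_is_feasible([0, 2, 1, 0], stops, capacity=2)
too_small = route_is_feasible([0, 1, 2, 0], stops, capacity=1)
```

In this toy instance the route depot, pickup, delivery, depot passes all checks, while reversing the pickup and delivery or shrinking the capacity below the pickup demand makes it infeasible. A learned constructive policy would typically apply a check of this kind as a mask over candidate next stops, though how the paper's model does so is not stated here.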

Author information

Authors and Affiliations

Lomonosov Moscow State University, 119991, Moscow, Russian Federation
A. G. Soroka, A. V. Meshcheryakov & S. V. Gerasimov
Space Research Institute of the Russian Academy of Sciences, 117997, Moscow, Russian Federation
A. V. Meshcheryakov

Corresponding authors

Correspondence to A. G. Soroka, A. V. Meshcheryakov or S. V. Gerasimov.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Soroka, Andrew Gennad’evich. Postgraduate student at the Faculty of Computational Mathematics and Cybernetics of Moscow State University. Received a bachelor’s degree with honors in 2018 and a master’s degree with honors in 2020. Scientific interests: reinforcement learning in combinatorial optimization problems, face biometrics, representation learning, and deep learning in astrophysics problems with limited labeled data.

Meshcheryakov, Alex Valer’evich. Candidate of Physical and Mathematical Sciences, mathematician of the Intelligent Information Technologies Department of the Faculty of Computational Mathematics and Cybernetics of Moscow State University, senior researcher at the Space Research Institute of the Russian Academy of Sciences. Graduated from the Physics Faculty of Moscow State University in 2002. In 2011, received the degree of Candidate of Physical and Mathematical Sciences. In 2006–2007, completed a two-year training program at the Center for Astrophysics at Harvard University. Web of Science ResearcherID U-4496-2017. Since 2014, has been engaged in research on applying machine learning and deep learning methods to problems of observational astrophysics; member of the Russian consortium of the SRG/eROSITA space mission. Research interests: astroinformatics, machine learning and deep learning in problems with limited labeled data, and reinforcement learning for combinatorial optimization problems.

Gerasimov, Sergey Valer’evich. Employee of the Intelligent Information Technologies Department of the Faculty of Computational Mathematics and Cybernetics of Moscow State University and author of the lecture course “Systems of Distributed Storage and Big Data Processing” in the master’s program “Intellectual Analysis of Big Data” at the same faculty. Research interests: machine learning and optimization methods in finance, DevOps, MLOps, and ModelOps technologies.

Translated by O. Pismenov

About this article

Cite this article

Soroka, A.G., Meshcheryakov, A.V. & Gerasimov, S.V. Deep Reinforcement Learning for the Capacitated Pickup and Delivery Problem with Time Windows. Pattern Recognit. Image Anal. 33, 169–178 (2023). https://doi.org/10.1134/S1054661823020165

Received: 26 December 2022
Revised: 26 December 2022
Accepted: 26 December 2022
Published: 03 July 2023
Issue Date: June 2023
DOI: https://doi.org/10.1134/S1054661823020165
