
Adaptive Learning Algorithms for Simulation-Based Dynamic Traffic User Equilibrium

Published in the International Journal of Intelligent Transportation Systems Research.

Abstract

We propose two classes of algorithms for achieving user equilibrium in simulation-based dynamic traffic assignment (DTA), with special attention to the interaction between travel information and route-choice behavior. Each driver is assumed to make a day-to-day route choice repeatedly and to experience payoffs perturbed by unknown noise; because the payoffs are initially unknown and observations are noisy, the driver must estimate them over time and adapt his or her actions accordingly. To solve this problem, we develop a multi-agent version of Q-learning that estimates the payoff functions using novel forms of the ε-greedy learning policy. We apply this Q-learning scheme to simulation-based DTA, in which the traffic flows and route travel times in the network are generated by a microscopic traffic simulator based on a cellular automaton. Finally, we provide simulation examples demonstrating the convergence of our algorithms to a Nash equilibrium and the effectiveness of best-route provision services.
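The day-to-day learning scheme described above can be illustrated with a minimal sketch. This is not the paper's implementation: a simple linear latency function stands in for the cellular-automaton microsimulator, the payoff noise is assumed Gaussian, and the network (two parallel routes), the ε-schedule, and all parameter values are illustrative assumptions. Each driver keeps Q-value estimates of (negative) route travel times, chooses a route ε-greedily with decaying ε, and updates its estimate by sample averaging of the noisy experienced payoff:

```python
import random

# Day-to-day route choice with multi-agent epsilon-greedy Q-learning.
# Hypothetical stand-in network: two parallel routes with linear latency
# t_r(x) = a_r + b_r * x, replacing the paper's cellular-automaton simulator.

N_DRIVERS = 100
ROUTES = [  # (free-flow time a, congestion slope b), illustrative values
    (10.0, 0.10),
    (15.0, 0.05),
]

def travel_time(route, flow):
    a, b = ROUTES[route]
    return a + b * flow

def simulate(days=500, eps0=0.5, noise_std=0.5, seed=0):
    rng = random.Random(seed)
    # Q[i][r]: driver i's estimate of the payoff (negative travel time) of route r.
    Q = [[0.0] * len(ROUTES) for _ in range(N_DRIVERS)]
    counts = [[0] * len(ROUTES) for _ in range(N_DRIVERS)]
    flows = [0] * len(ROUTES)
    for day in range(1, days + 1):
        eps = eps0 / day  # decaying exploration rate (assumed schedule)
        choices = []
        for i in range(N_DRIVERS):
            if rng.random() < eps:
                r = rng.randrange(len(ROUTES))          # explore
            else:
                r = max(range(len(ROUTES)), key=lambda k: Q[i][k])  # exploit
            choices.append(r)
        flows = [choices.count(r) for r in range(len(ROUTES))]
        for i, r in enumerate(choices):
            # Noisy payoff observation: negative experienced travel time.
            payoff = -travel_time(r, flows[r]) + rng.gauss(0.0, noise_std)
            counts[i][r] += 1
            alpha = 1.0 / counts[i][r]  # sample-average learning rate
            Q[i][r] += alpha * (payoff - Q[i][r])
    return flows

flows = simulate()
print("final route flows:", flows)
```

For these latency parameters the equilibrium split equalizes travel times across the two routes, so the flow pattern should settle with the low-free-flow-time route carrying the larger share; the decaying ε ensures every route is sampled early while choices become almost purely greedy in later days.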





Acknowledgments

This research is supported by MEXT Grants-in-Aid for Scientific Research, No. 26420511, for the term 2014-2016.

Corresponding author

Correspondence to Genaro Peque Jr.

About this article

Cite this article

Peque, G., Miyagi, T. & Kurauchi, F. Adaptive Learning Algorithms for Simulation-Based Dynamic Traffic User Equilibrium. Int. J. ITS Res. 16, 215–226 (2018). https://doi.org/10.1007/s13177-017-0150-6
