Applied Intelligence, Volume 40, Issue 2, pp 201–213

Hierarchical control of traffic signals using Q-learning with tile coding

  • Monireh Abdoos
  • Nasser Mozayani
  • Ana L. C. Bazzan

Abstract

Multi-agent systems are rapidly gaining ground as powerful tools for Intelligent Transportation Systems (ITS). Traffic signal control, as a part of ITS, is best performed in a distributed manner, so agent-based technologies can be used efficiently for this task. For traffic networks composed of multiple intersections, distributed control achieves better results than centralized methods. Hierarchical structures are useful for decomposing the network into multiple sub-networks and provide a mechanism for distributed control of the traffic signals.

In this paper, a two-level hierarchical control of traffic signals based on Q-learning is presented. In the first level (the bottom of the hierarchy), traffic signal controllers located at intersections act as autonomous agents that use Q-learning to learn a control policy. In the second level (the top of the hierarchy), the network is divided into regions and an agent is assigned to control each region. Due to the combinatorial explosion in the number of states and actions at this level, tabular Q-learning is impractical. Therefore, tile coding is used in the top level as a linear function approximation method.
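As a rough sketch of the bottom-level learning, the snippet below shows the standard one-step Q-learning update with epsilon-greedy exploration that such an intersection agent could use. The state encoding, the action set (signal phases), the reward signal, and the parameter values are illustrative assumptions; the abstract does not give the paper's exact design.

```python
import random
from collections import defaultdict

# Tabular Q-values for one intersection agent. States and signal phases are
# hypothetical placeholders, not the paper's exact encoding.
Q = defaultdict(float)

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration rate for epsilon-greedy selection (assumed)

def choose_phase(state, phases):
    """Pick a signal phase epsilon-greedily from the current Q estimates."""
    if random.random() < EPSILON:
        return random.choice(phases)
    return max(phases, key=lambda p: Q[(state, p)])

def q_update(state, phase, reward, next_state, phases):
    """One-step Q-learning update (Watkins and Dayan, 1992)."""
    best_next = max(Q[(next_state, p)] for p in phases)
    Q[(state, phase)] += ALPHA * (reward + GAMMA * best_next - Q[(state, phase)])
```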

A network composed of nine intersections arranged in a 3×3 grid is used for the simulation. Experimental results show that the proposed hierarchical control improves the Q-learning efficiency of the bottom-level agents. The impact of the parameters used in tile coding is also analyzed.
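To make the analyzed parameters concrete, the sketch below illustrates generic tile coding with linear function approximation in the style of Sutton and Barto: the number of tilings and the tile resolution are the kinds of parameters whose impact such an analysis would study. The one-dimensional input, the default values, and the update rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class TileCoder:
    """Minimal 1-D tile coder: several overlapping grids (tilings), each
    shifted by a fraction of the tile width, map a continuous input to a
    sparse set of active tiles."""

    def __init__(self, n_tilings=8, tiles_per_dim=10, low=0.0, high=1.0):
        # n_tilings and tiles_per_dim are assumed defaults, not paper values.
        self.n_tilings = n_tilings
        self.tiles_per_dim = tiles_per_dim
        self.low = low
        self.tile_width = (high - low) / tiles_per_dim
        self.offsets = [i * self.tile_width / n_tilings for i in range(n_tilings)]

    def active_tiles(self, x):
        """Return one active tile index per tiling for input x."""
        active = []
        for t, off in enumerate(self.offsets):
            idx = int((x - self.low + off) / self.tile_width)
            idx = min(idx, self.tiles_per_dim - 1)       # clip to the grid edge
            active.append(t * self.tiles_per_dim + idx)  # global feature index
        return active

# Linear function approximation: the value estimate is the sum of the weights
# of the active tiles, and a learning update touches only those weights.
coder = TileCoder()
weights = np.zeros(coder.n_tilings * coder.tiles_per_dim)

def value(x):
    return weights[coder.active_tiles(x)].sum()

def td_update(x, target, alpha=0.1):
    error = target - value(x)
    for i in coder.active_tiles(x):
        weights[i] += alpha / coder.n_tilings * error
```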

Keywords

Multi-agent systems · Hierarchical control · Traffic signals · Q-learning · Tile coding


Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Monireh Abdoos (1)
  • Nasser Mozayani (1)
  • Ana L. C. Bazzan (2)

  1. School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
  2. Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
