Evolutionary Intelligence, Volume 12, Issue 4, pp 689–712

UCRLF: unified constrained reinforcement learning framework for phase-aware architectures for autonomous vehicle signaling and trajectory optimization

  • Chiranjib Sur
Research Paper


Signaling and trajectory optimization are often treated as competing objectives, and researchers have debated which should take priority for the vehicle. The two components, however, appear to complement each other, and there are combined settings, within bounds, in which maximum optimization can be achieved. This paper introduces a novel approach, called Phase-Aware Deep Learning and Constrained Reinforcement Learning, for the optimization and continual improvement of signaling and trajectory in autonomous vehicle operation modules at an intersection. It covers all the components the signaling system needs to operate, communicate, and navigate the vehicle along a suitable trajectory, so that each vehicle waits less and the overall system runs with minimum waiting time and a comparable throughput rate. We analyze operating time and vehicle movement, as these are vital for pollution and energy consumption. Our methods are efficient in time and computation and also use a highly optimized data representation that reduces the overhead of maintaining and accessing the data. This ensures efficient time complexity, low theoretical computation time, and better lower bounds. The Constrained Reinforcement Learning concept is the main contribution of this work, and it helped reduce vehicle waiting time by 84%.


Combinatorial prediction; Deep learning based signaling; Autonomous vehicle trajectory; Data driven approach; Time-energy efficient




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Computer and Information Science and Engineering Department, University of Florida, Gainesville, USA
