Dynamic Weight-based Multi-Objective Reward Architecture for Adaptive Traffic Signal Control System

International Journal of Intelligent Transportation Systems Research

Abstract

An Adaptive Traffic Signal Control (ATSC) system uses real-time traffic information to control traffic lights, making the public transport system more reliable and accessible. Deep Reinforcement Learning (DRL) has recently proven useful for traffic signal control problems. However, designing a good reward function is one of the most crucial aspects of DRL, since the system learns to make proper decisions based on the reward. A multi-objective reward function is preferable for an ATSC system, but it is more challenging to design than a single-objective one. Existing multi-objective reward functions combine their parameters using pre-defined, fixed weights, an approach that requires rigorous training and cannot represent the actual impact of each parameter. To solve this problem, we propose a new reward architecture for ATSC called Dynamic Weights Multi-objective Reward Architecture (DWMORA). It calculates the weights in real time from the current traffic condition so that they reflect the actual impact of the parameters. A comparative study of the proposed approach against several existing reward functions shows improved road traffic in terms of waiting time, travel time, and halting number.
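
The contrast between fixed and dynamic weights can be illustrated with a minimal Python sketch. The abstract does not give DWMORA's weighting formula, so the rule used here (each weight is the objective's share of the total normalised cost at the current step) is an assumption made purely for illustration and should not be read as the authors' method. The objective names follow the metrics reported in the abstract; the function names and the `scale` upper bounds are hypothetical.

```python
from typing import Dict

# Objective names follow the metrics named in the abstract.
OBJECTIVES = ("waiting_time", "travel_time", "halting_number")


def normalise(metrics: Dict[str, float], scale: Dict[str, float]) -> Dict[str, float]:
    """Clip each raw metric to [0, 1] using an assumed per-objective upper bound."""
    return {k: min(max(metrics[k] / scale[k], 0.0), 1.0) for k in OBJECTIVES}


def fixed_weight_reward(
    metrics: Dict[str, float], scale: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Baseline: the weights are chosen once before training and never change."""
    norm = normalise(metrics, scale)
    return -sum(weights[k] * norm[k] for k in OBJECTIVES)


def dynamic_weights(metrics: Dict[str, float], scale: Dict[str, float]) -> Dict[str, float]:
    """Assumed rule (not from the paper): weight = share of total normalised cost."""
    norm = normalise(metrics, scale)
    total = sum(norm.values())
    if total == 0.0:
        # No congestion at all: fall back to uniform weights.
        return {k: 1.0 / len(OBJECTIVES) for k in OBJECTIVES}
    return {k: norm[k] / total for k in OBJECTIVES}


def dynamic_weight_reward(metrics: Dict[str, float], scale: Dict[str, float]) -> float:
    """Recompute the weights from the current traffic state, then scalarise."""
    norm = normalise(metrics, scale)
    weights = dynamic_weights(metrics, scale)
    return -sum(weights[k] * norm[k] for k in OBJECTIVES)


if __name__ == "__main__":
    # Hypothetical upper bounds and one hypothetical observation.
    scale = {"waiting_time": 300.0, "travel_time": 300.0, "halting_number": 20.0}
    step = {"waiting_time": 120.0, "travel_time": 90.0, "halting_number": 6.0}
    print(dynamic_weights(step, scale))
    print(dynamic_weight_reward(step, scale))
```

Under this assumed rule, the objective that is currently worst relative to its bound receives the largest weight, so the penalty follows whichever metric dominates the present traffic condition instead of a ratio fixed before training.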

Acknowledgements

This research is funded by the University Grants Commission (UGC), Bangladesh (grant no. 3/12354).

Author information

Corresponding author

Correspondence to Abu Rafe Md Jamil.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Jamil, A.R.M., Nower, N. Dynamic Weight-based Multi-Objective Reward Architecture for Adaptive Traffic Signal Control System. Int. J. ITS Res. 20, 495–507 (2022). https://doi.org/10.1007/s13177-022-00305-5

  • DOI: https://doi.org/10.1007/s13177-022-00305-5