
Decentralized multi-agent control of a three-tank hybrid system based on twin delayed deep deterministic policy gradient reinforcement learning algorithm

Published in: International Journal of Dynamics and Control

Abstract

In this study, a reinforcement learning (RL) method, twin delayed deep deterministic policy gradient (TD3), is used to tune the parameters of proportional-integral (PI) controllers for a nonlinear three-tank hybrid (TTH) system in a decentralized multi-agent manner. The proposed multi-agent reinforcement learning approach yields agents that are more capable than a single-agent counterpart and simplifies the overall control task. The TTH system combines continuous and discrete dynamics and serves as a benchmark for process characteristics such as time-varying dynamics, loop interactions, and nonlinearity. The real-time TTH setup is excited with a pseudorandom binary sequence (PRBS), and the resulting data are used to validate the system's first-principles model. Without initial or prior knowledge of the states, TD3 is relatively slow to explore toward the optimal policy. In this study, PI controller parameters tuned with internal model control (IMC) rules are used as the initial guess, so exploration from scratch is eliminated, which significantly improves training speed and convergence accuracy. The deep deterministic policy gradient (DDPG) algorithm is given the same initial guess as TD3. Both are also compared with traditional PI tuning methods such as IMC, SIMC, IIMC, and AMIGO. The results show that the RL-TD3-tuned PI controller outperforms the RL-DDPG-tuned PI controller and the traditional PI tuning methods in terms of integral squared error and control effort.
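The abstract describes tuning PI gains with an RL agent that starts from an IMC initial guess and is rewarded for low integral squared error (ISE) and low control effort. The sketch below is a minimal illustration of that reward structure on a simplified single-tank level loop; it is not the authors' TD3 implementation and not the actual TTH model. The tank parameters, the effort weight `lam`, and the random-search loop standing in for the TD3 policy update are assumptions made purely for illustration.

```python
# Minimal sketch (assumed, not the authors' code) of the reward an RL agent such as
# TD3 could optimise when tuning a PI level controller around an IMC initial guess.
import numpy as np

def simulate_pi_loop(kc, ti, setpoint=0.25, t_end=300.0, dt=0.5):
    """Simulate a simplified tank level loop under PI control.

    Assumed tank model: A * dh/dt = q_max * u - c * sqrt(h).
    Returns the integral squared error (ISE) and a control-effort measure.
    """
    A, c, q_max = 0.0154, 5e-4, 4e-4      # tank area, outlet coefficient, max inflow (assumed)
    h, integral = 0.10, 0.0               # initial level [m] and integral of error
    ise, effort, u_prev = 0.0, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        e = setpoint - h
        integral += e * dt
        u = float(np.clip(kc * (e + integral / ti), 0.0, 1.0))   # valve opening, 0..1
        q_in = q_max * u
        h = max(h + dt * (q_in - c * np.sqrt(max(h, 0.0))) / A, 0.0)
        ise += e ** 2 * dt
        effort += (u - u_prev) ** 2       # penalise aggressive valve movement
        u_prev = u
    return ise, effort

def reward(action, kc_imc=8.0, ti_imc=40.0, lam=0.1):
    """Reward for an action (delta_Kc, delta_Ti) applied to the IMC initial guess."""
    kc = max(kc_imc + action[0], 0.1)
    ti = max(ti_imc + action[1], 1.0)
    ise, effort = simulate_pi_loop(kc, ti)
    return -(ise + lam * effort)

# Crude random search standing in for the TD3 policy update, only to exercise the reward:
rng = np.random.default_rng(0)
best_a, best_r = np.zeros(2), reward(np.zeros(2))
for _ in range(200):
    a = best_a + rng.normal(scale=[1.0, 5.0])
    r = reward(a)
    if r > best_r:
        best_a, best_r = a, r
print("tuned Kc:", max(8.0 + best_a[0], 0.1),
      "tuned Ti:", max(40.0 + best_a[1], 1.0),
      "reward:", best_r)
```

Starting the search at the IMC guess (zero action) mirrors the paper's idea that the agent refines an already reasonable controller instead of exploring gains from scratch.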



Funding

No external funding was received in association with this research.

Author information


Contributions

N.R. contributed to conceptualization, data curation, formal analysis, investigation, methodology, validation, visualization, software, and writing—original draft. T.K.R. contributed to conceptualization, formal analysis, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing—original draft, and writing—review and editing. N.S. contributed to conceptualization, formal analysis, investigation, methodology, project administration, supervision, validation, visualization, and writing—review and editing.

Corresponding author

Correspondence to T. K. Radhakrishnan.

Ethics declarations

Conflict of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or nonfinancial interest in the subject matter or materials discussed in this manuscript.

Appendix A: Technical data of the three-tank hybrid system

1. Level transmitter (LT): built-in piezoelectric sensor, µC (microcontroller) based; input: 0–6500 mm H2O; output: 4–20 mA

2. Rotameter: range 44–440 L/h

3. Pneumatic control valve: range 500/1000 L/h; characteristics: equal percentage; valve action: air to open

4. Electro-pneumatic converter: input pneumatic signal 20 psi (constant); input current signal 4–20 mA; output signal 3–15 psi

5. Solenoid valve: type: magnet; current rating 100 mA; line connection ½″ BSP (F) thread

6. Air regulator: input 10.6 kg/cm²; output 2.1 kg/cm²; special feature: air regulator with filter

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Rajasekhar, N., Radhakrishnan, T.K. & Samsudeen, N. Decentralized multi-agent control of a three-tank hybrid system based on twin delayed deep deterministic policy gradient reinforcement learning algorithm. Int. J. Dynam. Control 12, 1098–1115 (2024). https://doi.org/10.1007/s40435-023-01227-0
