Reinforcement learning optimal control with semi-continuous reward function and fuzzy-rules networks for drug administration of cancer treatment

  • Application of soft computing
Soft Computing

Abstract

In this article, the dynamics of cell populations for cancer patients under chemotherapy drug administration are reformulated using the pseudo-partial derivative of input–output data. By utilizing only the tumor cell population as the output and the drug administration as the input data, a model-free adaptive controller is established with two fuzzy-rules emulated networks based on a reinforcement learning scheme, together with a convergence analysis of the internal signals. As a result, the optimal drug administration is derived with robustness to individual patients and delayed treatments. A rigorous numerical system is employed to validate the effectiveness of the proposed scheme.
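
To illustrate what a pseudo-partial derivative of input–output data means, here is a minimal sketch. The abstract does not state the estimator, so this assumes the compact-form dynamic linearization Δy(k+1) = φ(k)·Δu(k) and a projection-type update of the kind commonly used in model-free adaptive control. The function names, the gains `eta` and `mu`, and the toy plant are hypothetical and are not the authors' fuzzy-rules network scheme.

```python
def update_ppd(phi_hat, du, dy_next, eta=0.5, mu=1.0):
    """One projection-type update of the pseudo-partial derivative estimate.

    Assumed data model: dy_next = phi * du, where du is the input increment
    and dy_next the resulting output increment. eta (step size) and mu
    (regularization) are illustrative values, not taken from the article.
    """
    return phi_hat + eta * du * (dy_next - phi_hat * du) / (mu + du * du)


def estimate_ppd(inputs, outputs, phi0=0.1):
    """Run the update over recorded data, where outputs[k + 1] follows inputs[k]."""
    phi_hat = phi0
    for k in range(1, len(inputs)):
        du = inputs[k] - inputs[k - 1]          # drug-dose increment
        dy_next = outputs[k + 1] - outputs[k]   # tumor-population increment
        phi_hat = update_ppd(phi_hat, du, dy_next)
    return phi_hat


# Hypothetical toy plant: each unit of drug lowers the output by 2,
# so the true pseudo-partial derivative is -2.
inputs = [float(k % 2) for k in range(60)]
outputs = [5.0] + [5.0 - 2.0 * u for u in inputs]
phi_est = estimate_ppd(inputs, outputs)
```

Only measured input and output increments enter the update, which is the sense in which such a scheme is model-free: no explicit cell-population model is required by the estimator itself.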

Data availability

Enquiries about data availability should be directed to the authors.

Acknowledgements

The author gratefully acknowledges the contribution of the Mexican Research Organization CONACyT, Grant #257253, and CINVESTAV-IPN.

Funding

The authors have not disclosed any funding.

Author information

Contributions

CT contributed to conceptualization, formal analysis, research, MiFREN methodology, validation of results, writing, and review and editing. AJMV contributed to conceptualization, formal analysis, research, controller design, simulations, writing, and editing. NS contributed to validation of results, simulations, writing, and review and editing.

Corresponding author

Correspondence to Chidentree Treesatayapun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Written informed consent for publication was obtained from all participants.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Treesatayapun, C., Muñoz-Vázquez, A.J. & Suyaroj, N. Reinforcement learning optimal control with semi-continuous reward function and fuzzy-rules networks for drug administration of cancer treatment. Soft Comput 27, 17347–17356 (2023). https://doi.org/10.1007/s00500-023-08068-1

  • DOI: https://doi.org/10.1007/s00500-023-08068-1
