Use of Reinforcement Learning in Two Real Applications

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5323)


In this paper, we present two successful real-life applications of Reinforcement Learning (RL). First, we address the optimization of anemia management in patients with Chronic Renal Failure. The aim is to individualize the treatment (Erythropoietin dosage) in order to stabilize patients within a target range of Hemoglobin (Hb). Results show that the use of RL increases the proportion of patients within the desired Hb range; patients’ quality of life therefore improves, and the Health Care System reduces its expenses in anemia management. Second, RL is applied to adapt a marketing campaign in order to maximize long-term profits. RL obtains an individualized policy, dependent on customer characteristics, that increases long-term profits by the end of the campaign. The results in both problems show the robustness of the obtained policies and suggest their use in other real-life problems.
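The anemia-management setting described above can be framed as a Markov decision process: the patient's Hb band is the state, the dose adjustment is the action, and keeping the patient in the target band yields reward. The following is a minimal, purely illustrative sketch of tabular Q-learning on such a problem; the state/action names, the toy transition model, and all parameter values are assumptions for illustration, not the paper's actual formulation or clinical model.

```python
import random

random.seed(0)

# Hypothetical discretization: Hb bands as states, dose adjustments as actions
# (illustrative only -- not the discretization used in the paper).
STATES = ["low_hb", "target_hb", "high_hb"]
ACTIONS = ["decrease", "maintain", "increase"]

def step(state, action):
    """Toy transition/reward model: reward 1 when the patient ends in the target band."""
    if state == "low_hb":
        next_state = "target_hb" if action == "increase" else "low_hb"
    elif state == "high_hb":
        next_state = "target_hb" if action == "decrease" else "high_hb"
    else:  # target_hb
        next_state = {"maintain": "target_hb",
                      "increase": "high_hb",
                      "decrease": "low_hb"}[action]
    reward = 1.0 if next_state == "target_hb" else 0.0
    return next_state, reward

# Tabular Q-learning with epsilon-greedy exploration.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.1

state = "low_hb"
for _ in range(5000):
    # Explore with probability epsilon, otherwise act greedily.
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    # Standard Q-learning update toward the reward plus discounted best next value.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

# The learned (individualized) policy: best action per state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
print(policy)
```

In this toy model the learned policy pushes every state toward the target band (increase the dose when Hb is low, decrease it when high, maintain it otherwise), mirroring the stabilization objective described in the abstract.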







Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  1. Intelligent Data Analysis Laboratory, Department of Electronic Engineering, University of Valencia, Spain. Email: idal@uv.es
