Combining data augmentation, EDAs and grammatical evolution for blood glucose forecasting


The ideal solution for patients with type 1 diabetes mellitus is the widespread adoption of artificial pancreas systems. An artificial pancreas controls the blood glucose levels of people with diabetes, improving their quality of life. At the core of the system, an algorithm forecasts future glucose levels as a function of food ingestion and insulin bolus sizes. In previous works, several evolutionary computation techniques have been proposed as modeling or identification techniques in this area. One of the main obstacles that researchers have found when training the models is the lack of significant amounts of data. As in many other fields of medicine, collecting data from real patients is not an easy task, since it is necessary to control the environmental and patient conditions. In this paper, we propose three evolutionary algorithms that generate synthetic glucose time series from a patient's real data, so that the models can be trained with an augmented data set. The synthetic time series are used to train grammatical evolution models that work together in an ensemble. Experimental results show that, in a scarce data context, grammatical evolution models produce more accurate and robust predictions when data augmentation is used. In particular, we reduce the number of potentially dangerous predictions to 0 for a 30 min horizon, 2.5% for 60 min, 3.6% for 90 min and 5.5% for 2 h. The ensemble approach presented in this paper showed excellent performance when compared not only with a classical approach such as ARIMA, but also with other grammatical evolution approaches. We tested our techniques with data from real patients.
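As a rough illustration of the pipeline described in the abstract (synthetic series generated by an evolutionary algorithm, one model trained per synthetic series, and an ensemble that combines the forecasts), the following Python sketch may help. It is not the authors' method. The continuous UMDA-style estimation of distribution algorithm, the `fitness` function, the linear predictor standing in for a grammatical evolution model, all function and parameter names, and the toy glucose values are assumptions made only for this example.

```python
import random
import statistics


def fitness(candidate, real_days):
    """Hypothetical fitness: negative mean absolute error to the closest real series."""
    return -min(
        statistics.mean(abs(c - r) for c, r in zip(candidate, day))
        for day in real_days
    )


def umda_synthetic(real_days, pop_size=60, keep=20, generations=15, seed=0):
    """Continuous UMDA-style EDA: evolve independent per-time-step Gaussians
    and return the individuals selected in the last generation."""
    rng = random.Random(seed)
    n_steps = len(real_days[0])
    lo = min(min(day) for day in real_days)
    hi = max(max(day) for day in real_days)
    # Start from a broad distribution over the observed glucose range.
    model = [((lo + hi) / 2, (hi - lo) / 2)] * n_steps
    selected = []
    for _ in range(generations):
        population = [
            [rng.gauss(mu, sd) for mu, sd in model] for _ in range(pop_size)
        ]
        population.sort(key=lambda s: fitness(s, real_days), reverse=True)
        selected = population[:keep]
        # Re-estimate each marginal from the selected individuals only;
        # a floor on the standard deviation keeps some diversity.
        model = [
            (statistics.mean(col), max(statistics.stdev(col), 1.0))
            for col in zip(*selected)
        ]
    return selected


def fit_linear(series, horizon):
    """Least-squares fit of g[t + horizon] ~ a * g[t] + b.
    A simple stand-in for a grammatical evolution model (assumption)."""
    xs, ys = series[:-horizon], series[horizon:]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx


def ensemble_predict(models, current_glucose):
    """Combine the individual forecasts by simple averaging."""
    return statistics.mean(a * current_glucose + b for a, b in models)


# Made-up glucose readings (mg/dL) for three aligned days, for illustration only.
real_days = [
    [110, 125, 160, 180, 150, 130, 115, 105],
    [100, 120, 170, 190, 160, 135, 120, 110],
    [115, 130, 150, 175, 155, 125, 110, 100],
]

synthetic = umda_synthetic(real_days)                      # augmented data set
models = [fit_linear(s, horizon=1) for s in synthetic]     # one model per series
forecast = ensemble_predict(models, current_glucose=150)   # ensemble forecast
print(f"{len(synthetic)} synthetic series, ensemble forecast: {forecast:.1f} mg/dL")
```

Sampling each time step independently ignores the temporal correlation of real glucose curves, so this sketch only conveys the overall structure of the approach; a realistic generator would need a model that preserves the dynamics of the signal.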




  1.

    On 6 June 2012, the Clinical Research Ethics Committee of the Hospital of Alcalá de Henares (Spain) authorized the use of the collected data, provided that data privacy is ensured and the informed consent of patients is obtained.




This research is supported by the Spanish Minister of Science and Innovation (TIN2014-54806-R). The authors would like to thank the staff in the Principe de Asturias Hospital at Alcala de Henares for their support and assistance with this project. Special thanks also go to Maria Aranzazu Aramendi Zurimendi and Remedios Martinez Rodriguez.

Author information



Corresponding author

Correspondence to Jose Manuel Velasco.


Cite this article

Velasco, J.M., Garnica, O., Lanchares, J. et al. Combining data augmentation, EDAs and grammatical evolution for blood glucose forecasting. Memetic Comp. 10, 267–277 (2018).



  • Grammatical evolution
  • Diabetes
  • Time series forecasting
  • Data augmentation