Enhancing Grammatical Evolution Through Data Augmentation: Application to Blood Glucose Forecasting

  • Jose Manuel Velasco
  • Oscar Garnica
  • Sergio Contador
  • Jose Manuel Colmenar
  • Esther Maqueda
  • Marta Botella
  • Juan Lanchares
  • J. Ignacio Hidalgo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10199)


Currently, patients with Type 1 Diabetes Mellitus are waiting hopefully for the arrival of the Artificial Pancreas (AP) in the near future. AP systems will control the blood glucose of people who suffer from the disease, improving their lives and reducing the risks they face every day. At the core of the AP, an algorithm will forecast future glucose levels and estimate insulin bolus sizes. Grammatical Evolution (GE) has proved to be a suitable algorithm for predicting glucose levels. Nevertheless, one of the main obstacles that researchers have found when training GE models is the lack of significant amounts of data. As in many other fields of medicine, collecting data from real patients is very complex. In this paper, we propose a data augmentation algorithm that generates synthetic glucose time series from real data. The synthetic time series can be used to train a single GE model or to produce several GE models that work together in a combining system. Our experimental results show that, when data are scarce, Grammatical Evolution models can produce more accurate and robust predictions using data augmentation.
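The abstract does not describe the augmentation algorithm or the combining system in detail, so the following is only a minimal sketch of the general idea, under loudly stated assumptions: Gaussian jitter stands in for the paper's augmentation algorithm, a least-squares linear predictor stands in for a GE model, and simple averaging stands in for the combining system. The function names, the noise level, and the toy sine-wave "glucose" signal are all hypothetical and are not taken from the paper.

```python
import math
import random
import statistics


def augment(series, n_copies, noise_sd, rng):
    """Return n_copies synthetic series: the input plus Gaussian jitter.

    ASSUMPTION: jittering is a stand-in, not the paper's augmentation algorithm.
    """
    return [[g + rng.gauss(0.0, noise_sd) for g in series] for _ in range(n_copies)]


def fit_linear_predictor(series, horizon):
    """Least-squares fit of g[t + horizon] ~ a * g[t] + b.

    ASSUMPTION: this simple model is a placeholder for a GE-evolved expression.
    """
    xs = series[:-horizon]
    ys = series[horizon:]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda g: a * g + b


def ensemble_predict(models, g):
    """Combine the models by averaging their forecasts (assumed combining rule)."""
    return statistics.fmean(m(g) for m in models)


def run_demo(seed=0):
    rng = random.Random(seed)
    # Toy signal in mg/dL, generated from a sine wave; NOT patient data.
    real = [120.0 + 30.0 * math.sin(t / 3.0) for t in range(24)]
    synthetic = augment(real, n_copies=5, noise_sd=2.0, rng=rng)
    models = [fit_linear_predictor(s, horizon=1) for s in synthetic]
    forecast = ensemble_predict(models, real[-1])
    return real, synthetic, forecast


real, synthetic, forecast = run_demo()
print(f"one-step-ahead ensemble forecast: {forecast:.1f} mg/dL")
```

Training one model per synthetic series and averaging the forecasts corresponds to the "several models in a combining system" option; concatenating the synthetic series and fitting once would correspond to the single-model option.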


Keywords: Grammatical Evolution, Diabetes, Time series forecasting, Data augmentation, Combining systems



This research is supported by the Spanish Ministry of Science and Innovation (TIN2014-54806-R).

The authors would like to thank the staff in the Principe de Asturias Hospital at Alcala de Henares for their support and assistance with this project. Special thanks also go to Maria Aranzazu Aramendi Zurimendi and Remedios Martinez Rodriguez.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jose Manuel Velasco (1)
  • Oscar Garnica (1)
  • Sergio Contador (1)
  • Jose Manuel Colmenar (2)
  • Esther Maqueda (3)
  • Marta Botella (4)
  • Juan Lanchares (1)
  • J. Ignacio Hidalgo (1, corresponding author)

  1. Universidad Complutense de Madrid, Madrid, Spain
  2. Universidad Rey Juan Carlos, Móstoles, Spain
  3. Hospital Virgen de la Salud, Toledo, Spain
  4. Hospital U. Principe Asturias, Alcala de Henares, Spain
