Abstract
In diabetes treatment, the blood glucose level is a key quantity for evaluating a patient’s condition. Typically, blood glucose measurements are recorded by patients and annotated with symbolic quantities such as the date, a timestamp, and a measurement code (insulin dose, food intake, exercise). In clinical practice, predicting the blood glucose level under different conditions is an important task and plays a crucial role in personalized treatment. This paper describes a predictive model for the blood glucose level based on Gaussian processes. A covariance function is proposed to deal with categorical inputs. The usefulness of the presented model is demonstrated on real-life datasets concerning 10 patients. The experimental results show that the proposed model achieves a small predictive error, measured by the Mean Absolute Error criterion, even for small training samples.
Notes
1. In the experiment, we consider only three inputs (\(D=3\)) that are typical in diabetes treatment, namely the day of the week, the period of the day, and a measurement code. However, the presented idea is formulated in the general case, for any number of inputs.
2. A kernel function is a symmetric function for which the Gram matrix with elements \(k(\mathbf {x}_{n}, \mathbf {x}_{m})\) is positive semidefinite for any set \(\{\mathbf {x}_{n}\}_{n=1}^{N}\) [20].
3. By reasonably small we mean up to \(N=1000\).
4. We omitted the day of the week and the period of the day for two reasons. First, we wanted the mean function to have fewer parameters. Second, in preliminary experiments, also including \(x_{1}\) and \(x_{2}\) resulted in no significant change in the performance of the GP.
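The notes above mention three categorical inputs, the kernel (Gram matrix) condition, and small training sets. The following is a minimal sketch of how these pieces could fit together. It is not the covariance function proposed in the paper, which this page does not state. The kernel `categorical_kernel` (an exponential of the number of mismatching inputs), the constant mean, the noise level, and all data values are assumptions chosen purely for illustration; the data are synthetic and do not come from the paper's datasets.

```python
import math
import random

def categorical_kernel(x, y, theta=0.5, sigma2=1.0):
    """Illustrative kernel (an assumption, not the paper's):
    sigma2 * exp(-theta * number of mismatching categorical inputs)."""
    mismatches = sum(1 for a, b in zip(x, y) if a != b)
    return sigma2 * math.exp(-theta * mismatches)

def gram_matrix(X, kernel):
    """Gram matrix with elements k(x_n, x_m)."""
    return [[kernel(xn, xm) for xm in X] for xn in X]

def psd_spot_check(K, trials=100, seed=0):
    """Spot-check v^T K v >= 0 for random v (necessary, not a proof)."""
    rng = random.Random(seed)
    n = len(K)
    for _ in range(trials):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        q = sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))
        if q < -1e-9:
            return False
    return True

def cholesky(A):
    """Lower-triangular L with A = L L^T for a positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) a = b by forward and back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a

def gp_fit(X, y, kernel, noise=0.1):
    """Constant-mean GP regression: returns the mean and (K + noise*I)^{-1}(y - mean)."""
    mean = sum(y) / len(y)
    K = gram_matrix(X, kernel)
    for i in range(len(X)):
        K[i][i] += noise  # observation noise variance on the diagonal
    alpha = solve_cholesky(cholesky(K), [t - mean for t in y])
    return mean, alpha

def gp_predict(x_new, X, mean, alpha, kernel):
    """Posterior predictive mean at a new categorical input."""
    return mean + sum(kernel(x_new, xn) * a for xn, a in zip(X, alpha))

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic toy data (NOT from the paper):
# each input is (day of week, period of day, measurement code).
X_train = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 2, 2),
           (2, 1, 1), (3, 0, 0), (4, 2, 2), (5, 1, 1)]
y_train = [110.0, 160.0, 115.0, 95.0, 155.0, 108.0, 98.0, 162.0]
X_test = [(6, 0, 0), (6, 1, 1), (2, 2, 2)]
y_test = [112.0, 158.0, 96.0]

print("Gram matrix passes PSD spot check:",
      psd_spot_check(gram_matrix(X_train, categorical_kernel)))
mean, alpha = gp_fit(X_train, y_train, categorical_kernel)
y_pred = [gp_predict(x, X_train, mean, alpha, categorical_kernel) for x in X_test]
print("MAE on toy test set:", mean_absolute_error(y_test, y_pred))
```

The mismatch-count kernel is a product of per-input kernels of the form "1 if the categories match, a constant in (0, 1) otherwise", each of which is positive semidefinite, so the product satisfies the condition in note 2. With \(N\) up to about 1000, the cubic cost of the Cholesky factorization remains manageable, which is consistent with the "reasonably small" regime of note 3.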
References
Agresti, A.: An Introduction to Categorical Data Analysis. Wiley-Interscience, New York (2007)
Alemdar, H., Ersoy, C.: Wireless sensor networks for healthcare: a survey. Comput. Netw. 54(15), 2688–2710 (2010)
Ažman, K., Kocijan, J.: Application of Gaussian processes for black-box modelling of biosystems. ISA Trans. 46, 443–457 (2007)
Billard, L., Diday, E.: From the statistics of data to the statistics of knowledge: symbolic data analysis. J. Am. Stat. Assoc. 98(462), 470–487 (2003)
Bishop, C.: Pattern Recognition and Machine Learning. Springer, New York (2006)
Breiman, L., Friedman, J., Olshen, R., Stone, C., Steinberg, D., Colla, P.: CART: Classification and Regression Trees. Wadsworth, Belmont (1983)
Chu, W., Ghahramani, Z., Falciani, F., Wild, D.: Biomarker discovery in microarray gene expression data with Gaussian processes. Bioinformatics 21(16), 3385–3393 (2005)
Daemen, A., De Moor, B.: Development of a kernel function for clinical data. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2009), pp. 5913–5917. IEEE (2009)
De Gaetano, A., Arino, O.: Mathematical modelling of the intravenous glucose tolerance test. J. Math. Biol. 40, 136–168 (2000)
Fischer, I., Meinl, T.: Graph based molecular data mining - an overview. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4578–4582. IEEE (2004)
Frank, A., Asuncion, A.: UCI machine learning repository (2010). http://archive.ics.uci.edu/ml
Gärtner, T.: A survey of kernels for structured data. ACM SIGKDD Explor. Newsl. 5(1), 49–58 (2003)
Grzech, A., Juszczyszyn, K., Swiatek, P., Mazurek, C., Sochan, A.: Applications of the future internet engineering project. In: International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel & Distributed Computing (SNPD), pp. 635–642. IEEE (2012)
Huang, Z.: Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Discov. 2(3), 283–304 (1998)
Hyndman, R., Koehler, A.: Another look at measures of forecast accuracy. Int. J. Forecast. 22(4), 679–688 (2006)
Iannario, M.: Preliminary estimators for a mixture model of ordinal data. Adv. Data Anal. Classif. 6, 163–184 (2012)
Likar, B., Kocijan, J.: Predictive control of a gas-liquid separation plant based on a Gaussian process model. Comput. Chem. Eng. 31, 142–152 (2007)
Makosso-Kallyth, S., Diday, E.: Adaptation of interval PCA to symbolic histogram variables. Adv. Data Anal. Classif. 6, 1–13 (2012)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, London (2006)
Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)
Srinivasan, A., King, R.D.: Feature construction with inductive logic programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Min. Knowl. Discov. 3(1), 37–57 (1999)
Tomczak, J., Gonczarek, A.: Decision rules extraction from data stream in the presence of changing context for diabetes treatment. Knowl. Inf. Syst. 34, 521–546 (2013)
Tomczak, J., Świątek, J., Latawiec, K.: Gaussian process regression as a predictive model for quality-of-service in web service systems. arXiv preprint arXiv:1207.6910 (2012)
Turner, R., Deisenroth, M.P., Rasmussen, C.E.: System identification in Gaussian process dynamical systems. In: Görür, D. (ed.) NIPS Workshop on Nonparametric Bayes. Whistler, Canada (2009)
Węglarz-Tomczak, E., Vassiliou, S., Mucha, A.: Discovery of potent and selective inhibitors of human aminopeptidases ERAP1 and ERAP2 by screening libraries of phosphorus-containing amino acid and dipeptide analogues. Bioorg. Med. Chem. Lett. 26(16), 4122–4126 (2016)
World Health Organization. Definition and diagnosis of diabetes mellitus and intermediate hyperglycemia. Report of a WHO/IDF Consultation (2006)
Zięba, M., Świątek, J.: Ensemble classifier for solving credit scoring problems. IFIP AICT 372, 59–66 (2012)
Acknowledgements
The research is partially supported by a grant co-financed by the Ministry of Science and Higher Education in Poland.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tomczak, J.M. (2017). Gaussian Process Regression with Categorical Inputs for Predicting the Blood Glucose Level. In: Świątek, J., Tomczak, J. (eds) Advances in Systems Science. ICSS 2016. Advances in Intelligent Systems and Computing, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-319-48944-5_10
DOI: https://doi.org/10.1007/978-3-319-48944-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48943-8
Online ISBN: 978-3-319-48944-5
eBook Packages: Engineering, Engineering (R0)