Managing uncertainty in imputing missing symptom value for healthcare of rural India
In India, 67% of the total population lives in remote areas, where providing primary healthcare is a real challenge due to the scarcity of doctors. Health kiosks are deployed in remote villages to collect basic health data such as blood pressure, pulse rate, height, weight, BMI and oxygen saturation level (SpO2). The acquired data are often imprecise due to measurement error and contain missing values. The paper proposes a comprehensive framework to impute missing symptom values by managing the uncertainty present in the data set.
The data sets are fuzzified to manage uncertainty, and the fuzzy c-means clustering algorithm is applied to group the symptom feature vectors into different disease classes. The missing symptom values corresponding to each disease are imputed using multiple fuzzy-based regression models. Relations between different symptoms are framed with the help of experts and medical literature. Blood pressure is handled with a novel approach because its characteristics differ from those of other symptoms. The patient records obtained from the kiosks are not adequate, so relevant data are simulated by the Monte Carlo method to avoid over-fitting while imputing missing symptom values. The generated data sets are verified using Kullback–Leibler (K–L) distance and distance correlation (dCor) techniques, which show that the simulated data sets are well correlated with the real data set.
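The simulate-then-verify step described above can be illustrated with a minimal sketch. It assumes each symptom is modelled as a univariate normal distribution, which the abstract does not state; the pulse-rate readings, the function names `kl_normal` and `simulate`, and the sample size are all hypothetical and are not taken from the paper.

```python
import math
import random
import statistics


def kl_normal(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form K-L distance KL(P || Q) between two univariate normals."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def simulate(real_values, n, seed=0):
    """Monte Carlo draw of n values from a normal fitted to real_values."""
    rng = random.Random(seed)
    mu = statistics.mean(real_values)
    sigma = statistics.stdev(real_values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical pulse-rate readings (beats per minute); illustrative only.
real_pulse = [72, 80, 76, 88, 69, 74, 91, 78, 83, 77]
simulated_pulse = simulate(real_pulse, n=5000)

# A small K-L distance suggests the simulated sample follows the real one.
kl = kl_normal(
    statistics.mean(real_pulse), statistics.stdev(real_pulse),
    statistics.mean(simulated_pulse), statistics.stdev(simulated_pulse),
)
print(f"KL(real || simulated) = {kl:.5f}")
```

A value close to zero indicates that the fitted distributions nearly coincide. The distance correlation check mentioned above is a separate test and is not covered by this sketch.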
Using these data sets, the proposed model is built and new patients are provisionally diagnosed using the Softmax cost function. Multiple class labels, representing diseases, are determined with about 98% accuracy and verified against the ground truth provided by the experts.
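As a rough illustration of how a Softmax-based diagnosis step might work, the sketch below converts per-disease scores into probabilities and computes the corresponding cross-entropy cost. The disease labels and scores are hypothetical placeholders, and the single-label setting shown here is an assumption; the paper's actual model and its handling of multiple labels may differ.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Softmax cost for one patient whose true class is true_index."""
    return -math.log(probs[true_index])


# Hypothetical disease classes and model scores for one new patient.
diseases = ["disease_a", "disease_b", "disease_c"]
scores = [2.0, 0.5, -1.0]

probs = softmax(scores)
predicted = diseases[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

The cost is lowest when the class with the highest probability matches the expert-provided label, which is what training on the cost function would encourage.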
It is worth mentioning that the system is intended for primary healthcare; in emergency cases, patients are referred to the experts.
Keywords: Rural healthcare, Missing value, Regression model, Fuzzification, Monte Carlo method, Softmax classifier
This work is supported by the Information Technology Research Academy (ITRA), Digital India Corporation (formerly Media Lab Asia), Government of India, under ITRA-Mobile Grant [ITRA/15(59)/Mobile/Remote Health/01].
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.