Abstract
The volume of data produced within health informatics has grown vast. The large volume of data generated by vital sign monitoring devices needs to be analysed in real time to alert care providers to changes in a patient's condition. Processing such volumes in real time poses complex challenges: a real-time system should be able to collect millions of events per second and process them in parallel to extract meaningful information efficiently. In this study, we propose a real-time BigData and Predictive Analytical Architecture for healthcare applications. The proposed architecture comprises three phases: (1) collection of data, (2) offline data management and prediction model building and (3) real-time processing and actual prediction. We use Apache Kafka, Apache Sqoop, Hadoop, MapReduce, Storm and logistic regression to predict an emergency condition. The proposed architecture can detect emergencies early in real time, and can analyse structured and unstructured data such as Electronic Health Records (EHRs) offline to predict a patient's risk of disease or readmission. We evaluate prediction performance on different benchmark datasets, both for detecting a patient's emergency condition in real time and for predicting the possibility of readmission.
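As an illustration of how phases (2) and (3) might fit together, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical vital-sign features (`heart_rate`, `spo2`), synthetic training data, and a hand-written logistic regression. An in-memory queue stands in for the message-queue and stream-processing layers (Apache Kafka and Storm in the tool list above). All function names, scaling constants and the 0.5 alert threshold are assumptions made for the example.

```python
import math
import random
from collections import deque


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def normalise(event):
    # Hypothetical scaling constants, chosen only to keep features near zero.
    return [(event["heart_rate"] - 80.0) / 20.0,
            (event["spo2"] - 95.0) / 5.0]


def make_synthetic(n, seed=0):
    # Synthetic stand-in for historical records; not real patient data.
    rng = random.Random(seed)
    events, labels = [], []
    for _ in range(n):
        emergency = rng.random() < 0.5
        hr = rng.gauss(130, 10) if emergency else rng.gauss(75, 8)
        spo2 = rng.gauss(86, 3) if emergency else rng.gauss(97, 1)
        events.append({"heart_rate": hr, "spo2": spo2})
        labels.append(1 if emergency else 0)
    return events, labels


def train_logistic(rows, labels, lr=0.1, epochs=200):
    # Offline phase: fit weights and bias by batch gradient descent
    # on the mean log-loss.
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def score_stream(stream, w, b, threshold=0.5):
    # Real-time phase: consume events one at a time and yield
    # (event, probability, alert flag). The deque is a stand-in
    # for a real streaming source.
    while stream:
        event = stream.popleft()
        x = normalise(event)
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        yield event, p, p >= threshold


if __name__ == "__main__":
    history, labels = make_synthetic(200, seed=1)
    w, b = train_logistic([normalise(e) for e in history], labels,
                          lr=0.5, epochs=300)
    incoming = deque([{"heart_rate": 140.0, "spo2": 84.0},
                      {"heart_rate": 72.0, "spo2": 98.0}])
    for event, p, alert in score_stream(incoming, w, b):
        print(event, round(p, 3), "ALERT" if alert else "ok")
```

In a deployment along the lines the abstract describes, the training step would run offline over stored records and the scoring loop would run inside the stream-processing layer; the sketch keeps both in one process only to show the hand-off of the fitted weights.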
Cite this article
Chauhan, V., Gaur, R., Tiwari, A. et al. Real-time BigData and Predictive Analytical Architecture for healthcare application. Sādhanā 44, 237 (2019). https://doi.org/10.1007/s12046-019-1220-z