Abstract
A vast volume of digitized clinical data has accumulated rapidly since the widespread adoption of Electronic Medical Records (EMRs). These data hold the promise of moving healthcare from a proficiency-based art to a data-driven science, from a reactive mode to a proactive mode, and from one-size-fits-all medicine to personalized medicine. This chapter first discusses the research background: big data analytics in healthcare, its research framework, analysis of the medical process, and a summary of the literature on diagnosis-treatment pattern mining. It then highlights the challenges of data-driven typical diagnosis-treatment pattern mining given the rich temporal and heterogeneous medical information in EMRs, including similarity measures between diagnosis and treatment records and the extraction, prediction, evaluation, and recommendation of typical diagnosis-treatment patterns. Furthermore, a data-driven unifying diagnosis identification and prediction method (UDIPM) that embeds the disease ontology structure is proposed from EMRs to assist in better coding integration of diagnoses. Finally, three categories of typical treatment patterns are mined from the doctor order content, duration, and sequence views, respectively, which can provide data-driven guidelines for achieving the “5R” goal of rational drug use and for clinical pathways.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (71771034; 72101236), the Fundamental Funds for the Central Universities of China (DUT21YG108), the Henan Province Medical Science and Technology Research Plan (LHGJ20200279), and the Henan Province Youth Talent Promotion Project (2021HYTP052).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Guo, C., Chen, J. (2023). Big Data Analytics in Healthcare. In: Nakamori, Y. (eds) Knowledge Technology and Systems. Translational Systems Sciences, vol 34. Springer, Singapore. https://doi.org/10.1007/978-981-99-1075-5_2
DOI: https://doi.org/10.1007/978-981-99-1075-5_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1074-8
Online ISBN: 978-981-99-1075-5
eBook Packages: Business and Management, Business and Management (R0)