Big Data Analytics in Healthcare

Knowledge Technology and Systems

Part of the book series: Translational Systems Sciences (TSS, volume 34)

Abstract

Since the widespread adoption of Electronic Medical Records (EMRs), a vast volume of digitized clinical data has been generated and accumulated rapidly. These data hold the promise of moving healthcare from a proficiency-based art to a data-driven science, from a reactive mode to a proactive mode, and from one-size-fits-all medicine to personalized medicine. This chapter first reviews the research background: big data analytics in healthcare, its research framework, analysis of the medical process, and a summary of the literature on diagnosis-treatment pattern mining. It then highlights the challenges of data-driven mining of typical diagnosis-treatment patterns given the rich temporal and heterogeneous medical information in EMRs, including similarity measures between diagnosis and treatment records and the extraction, prediction, evaluation, and recommendation of typical diagnosis-treatment patterns. Furthermore, a data-driven unifying diagnosis identification and prediction method (UDIPM) that embeds the disease ontology structure is proposed to support better integration of diagnosis coding from EMRs. Finally, three categories of typical treatment patterns are mined from the content, duration, and sequence views of doctor orders, respectively; these can serve as data-driven guidelines toward the “5R” goal for rational drug use and clinical pathways.


Acknowledgments

This work was supported by the National Natural Science Foundation of China (71771034; 72101236), the Fundamental Funds for the Central Universities of China (DUT21YG108), the Henan Province Medical Science and Technology Research Plan (LHGJ20200279), and the Henan Province Youth Talent Promotion Project (2021HYTP052).

Author information

Corresponding author

Correspondence to Jingfeng Chen.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Guo, C., Chen, J. (2023). Big Data Analytics in Healthcare. In: Nakamori, Y. (ed.) Knowledge Technology and Systems. Translational Systems Sciences, vol 34. Springer, Singapore. https://doi.org/10.1007/978-981-99-1075-5_2
