Federated Learning for Data Mining in Healthcare

  • Chapter
Federated Learning for IoT Applications

Abstract

Hardware and software have both advanced rapidly, and increasing amounts of healthcare data are now readily available, for example from insurance companies, medical institutions, the pharmaceutical industry, patients, and the personal devices people use to monitor their health. Data science researchers are using this data to improve the quality of care delivery. However, it is difficult to obtain robust results from healthcare data because the data is private and usually fragmented. Patient records that hospitals hold in electronic health record (EHR) systems cannot be shared because the data is considered sensitive. This makes it difficult to develop effective approaches for such diverse and sensitive data. Federated learning is a mechanism for training a shared global model, coordinated by a central server, while keeping sensitive data at the local sites where it originates. It can safeguard security while linking healthcare data sources that are otherwise fragmented. The objective of this overview is to survey advances in federated learning, particularly for data mining in healthcare. Federated learning enables training a global machine learning model from data distributed across different sites, without moving the data. This is especially relevant in healthcare applications, where the data is personal and highly sensitive. Furthermore, data analysis methods must comply with regulatory guidelines. Although federated learning avoids sharing raw data, privacy attacks remain possible on the model parameters exposed during training, or on the resulting machine learning model.
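To make the training mechanism concrete, the following is a minimal sketch of one common aggregation scheme, federated averaging. The abstract does not name a specific algorithm, so this choice is an assumption, as are all function names, the linear model, the learning rate, and the synthetic "site" data standing in for hospital records. Each site trains on its own data, and only model parameters travel to the server.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one site's private data.

    Only the updated (w, b) pair leaves the site; `data` never does.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y  # prediction error of the linear model
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(updates, sizes):
    """Server step: average site parameters, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites, rounds=200):
    """Alternate local training and server-side averaging for `rounds` rounds."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in sites]
    for _ in range(rounds):
        # Each site starts from the current global model.
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_site(rng, n, lo, hi):
    """Hypothetical stand-in for one site's records: y = 2x + 1 plus noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)
        data.append((x, y))
    return data


if __name__ == "__main__":
    # Three sites with different input ranges, i.e. fragmented, non-identical data.
    ranges = [(0.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
    sites = [make_site(random.Random(seed), 50, lo, hi)
             for seed, (lo, hi) in enumerate(ranges)]
    w, b = train(sites)
    print(f"global model: y = {w:.3f} * x + {b:.3f}")
```

The server in this sketch sees every site's parameters in the clear at each round. That is the exposure the abstract refers to when it notes that privacy attacks can target model parameters even though raw data stays local.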

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Sharma, S., Kesarwani, A., Maheshwari, S., Rai, B.K. (2022). Federated Learning for Data Mining in Healthcare. In: Yadav, S.P., Bhati, B.S., Mahato, D.P., Kumar, S. (eds) Federated Learning for IoT Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-85559-8_16
