Abstract
Hardware and software have both advanced considerably, and a growing volume of healthcare data is now readily available from insurance companies, medical institutions, the pharmaceutical industry, patients, and the personal devices people use to monitor their health. Data science researchers are taking advantage of this data to improve the quality of care delivery. However, it is difficult to obtain robust results from healthcare data because the data is private and usually fragmented. Patient records held by hospitals in electronic health records (EHR) generally cannot be shared because the data is considered sensitive, which makes it difficult to develop effective approaches for such diverse and sensitive data. Federated learning is a mechanism for training a shared global model, coordinated by a central server, while the sensitive data stays at the local sites where it originates. It can safeguard data security and can also connect healthcare data sources that are fragmented. The objective of this overview is to survey advances in federated learning, particularly for data mining in healthcare. Federated learning enables training a global machine learning model from data distributed across different sites without moving the data. This is especially relevant in healthcare applications, where the data is personal and highly sensitive and data analysis methods must comply with regulatory guidelines. Although federated learning avoids sharing raw data, privacy attacks can still be launched against the model parameters exposed during training, or against the resulting machine learning model.
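The training mechanism described in the abstract can be illustrated with a minimal sketch of federated averaging on synthetic data. This is not the chapter's own method: the function names (`local_update`, `federated_average`, `train_federated`, `make_site`), the linear model, the learning rate, and the three simulated "hospital" sites are all assumptions chosen for illustration. Each site trains on its own records, and only model parameters, never raw data, reach the server.

```python
import random

def local_update(weights, data, lr=0.05, epochs=5):
    """Hypothetical local step: gradient descent on squared error for y ~ w*x + b.
    Runs entirely on one site's own data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(site_weights, site_sizes):
    """Server step: average the sites' parameters, weighted by local sample count."""
    total = sum(site_sizes)
    w = sum(wi * n for (wi, _), n in zip(site_weights, site_sizes)) / total
    b = sum(bi * n for (_, bi), n in zip(site_weights, site_sizes)) / total
    return (w, b)

def train_federated(sites, rounds=200):
    """Coordinate training rounds; the server sees parameters only, not records."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights

def make_site(n, low, high, rng):
    """Synthetic stand-in for one site's private data: y = 2x + 1 plus noise."""
    xs = [rng.uniform(low, high) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in xs]

rng = random.Random(0)
# Three simulated sites with different sizes and input ranges (assumed values).
sites = [make_site(40, 0, 1, rng), make_site(80, 1, 2, rng), make_site(20, 0, 2, rng)]
w, b = train_federated(sites)
print(f"global model: w={w:.2f}, b={b:.2f}")
```

In this sketch the server still receives each site's parameters in the clear, which is exactly the surface the abstract identifies for privacy attacks; the sketch does nothing to mitigate them.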
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Sharma, S., Kesarwani, A., Maheshwari, S., Rai, B.K. (2022). Federated Learning for Data Mining in Healthcare. In: Yadav, S.P., Bhati, B.S., Mahato, D.P., Kumar, S. (eds) Federated Learning for IoT Applications. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-85559-8_16
DOI: https://doi.org/10.1007/978-3-030-85559-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85558-1
Online ISBN: 978-3-030-85559-8
eBook Packages: Intelligent Technologies and Robotics (R0)