Journal of Medical Systems, 41:183

A Systematic Review of Techniques and Sources of Big Data in the Healthcare Sector

  • Susel Góngora Alonso
  • Isabel de la Torre Díez
  • Joel J. P. C. Rodrigues
  • Sofiane Hamrioui
  • Miguel López-Coronado
Part of the following topical collection: Systems-Level Quality Improvement


The main objective of this paper is to review existing research on Big Data sources and techniques in the health sector and to identify which of these techniques are most used in the prediction of chronic diseases. Academic databases and systems such as IEEE Xplore, Scopus, PubMed and Science Direct were searched for publications dated from 2006 to the present. Several search criteria were established, such as ‘techniques’ OR ‘sources’ AND ‘Big Data’ AND ‘medicine’ OR ‘health’, and ‘techniques’ AND ‘Big Data’ AND ‘chronic diseases’. Papers were selected when their description of the techniques and sources of Big Data in healthcare was considered of interest. The search found a total of 110 articles on techniques and sources of Big Data in health, of which only 32 were identified as relevant work. Many of the articles describe the Big Data platforms, sources and databases used, and identify the techniques most used in the prediction of chronic diseases. The review of the analyzed articles shows that the sources and techniques of Big Data used in the health sector are a relevant factor in terms of effectiveness, since they allow predictive analysis techniques to be applied to tasks such as identifying patients at risk of readmission and preventing hospital infections or chronic diseases, and so yield good-quality predictive models.


Keywords: Big data; Chronic diseases; Data mining; Health sector; Sources; Techniques



This research has been partially supported by the European Commission and the Ministry of Industry, Energy and Tourism under the project AAL-20125036 named “WetakeCare: ICT-based Solution for (Self-) Management of Daily Living”; by National Funding from the FCT – Fundação para a Ciência e a Tecnologia through the UID/EEA/500008/2013 Project; by the Government of the Russian Federation, Grant 074-U01; and by Finep, with resources from Funttel, Grant No. 01.14.0231.00, under the Centro de Referência em Radiocomunicações (CRR) project of the Instituto Nacional de Telecomunicações (Inatel), Brazil.

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they have no competing interests.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Signal Theory and Communications, and Telematics Engineering, University of Valladolid, Valladolid, Spain
  2. National Institute of Telecommunications (Inatel), Santa Rita do Sapucaí, Brazil
  3. Instituto de Telecomunicações, Covilhã, Portugal
  4. ITMO University, St. Petersburg, Russia
  5. University of Fortaleza (UNIFOR), Fortaleza, Brazil
  6. Bretagne Loire and Nantes Universities, Nantes, France
