A survey of issues and solutions of health data management systems

  • Anindita Sarkar Mondal
  • Sarmistha Neogy
  • Nandini Mukherjee
  • Samiran Chattopadhyay
Review Article


In recent years, data science has come to play an important role in the health-care domain, helping to provide better and more cost-effective treatment. Data management systems contribute substantially to this goal by controlling, arranging, storing and preprocessing large volumes of health data. Many approaches have already been investigated and designed to support big data applications in different domains. Even so, managing big data remains a challenging task for data scientists because of the complex characteristics of the data and the demands of the applications. In this survey paper, we discuss the challenges that arise and their possible solutions by considering the entities related to data services. The survey is intended to help data scientists understand the supporting parameters of data storage systems when designing a big data management system.


Big data management system; Health-care domain; Data application; Issues and challenges; Data storage




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. School of Mobile Computing and Communication, Jadavpur University, Kolkata, India
  2. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
  3. Department of Information Technology, Jadavpur University, Kolkata, India
