The State of Data in Healthcare: Path Towards Standardization


Coupled with the rise of data science and machine learning, the increasing availability of digitized health and wellness data has provided an exciting opportunity for complex analyses of problems throughout the healthcare domain. Whereas many early works focused on a particular aspect of patient care, often drawing on data from a specific clinical or administrative source, it has become clear that such a single-source approach is insufficient to capture the complexity of the human condition. Instead, adequately modeling health and wellness problems requires the ability to draw upon data spanning multiple facets of an individual’s biology, their care, and the social aspects of their life. Although this awareness has greatly expanded the breadth of health and wellness data collected, the diverse array of data sources and intended uses often leaves researchers and practitioners with a scattered and fragmented view of any particular patient. As a result, there is a clear need to catalogue and organize the range of healthcare data available for analysis. This work represents an effort to develop such an organization, presenting a patient-centric framework termed the Healthcare Data Spectrum (HDS). Comprising six layers, the HDS begins at its innermost with the micro-level omics and macro-level demographic data that directly characterize a patient, and extends at its outermost to aggregate population-level data derived from attributes of care for each individual patient. For each level of the HDS, this manuscript examines the specific types of constituent data, provides examples of how the data aid in a broad set of research problems, and identifies the primary terminology and standards used to describe the data.





This work is supported in part by the National Science Foundation (NSF) Grant IIS-1447795.

Author information



Corresponding author

Correspondence to Nitesh V. Chawla.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.


Cite this article

Feldman, K., Johnson, R.A. & Chawla, N.V. The State of Data in Healthcare: Path Towards Standardization. J Healthc Inform Res 2, 248–271 (2018).

Keywords

  • Healthcare analytics
  • Big data
  • Review
  • Standards