Abstract
Big data in health is a subject that attracts many column inches and many claims. It is closely connected with other areas that also sit near the peak of the Gartner Hype Cycle: Artificial Intelligence, Machine Learning, Precision Medicine and Genomics. Given this exposure, and the potential for great benefit if done right, this chapter sounds a largely cautionary note. We must be aware of, and properly address, the technical, trust, privacy and governance issues that Big Data raises. From Hippocratic times, Primum non nocere (first do no harm) has driven medical advance. Big Data must be no exception. We need to take great care at the outset with Big Data because, once out, the genie will not go back in the bottle. Answers to some of the technical concerns have existed since the early 1980s; unless those concerns are addressed at the point of care now, they will continue to foster a garbage-in, garbage-out model. While there is no doubt that a degree of composting can improve matters, machine learning, AI and true precision medicine require high-quality, semantically interoperable, structured, coded and curated data. Alongside these technical concerns we consider the potential impact on the data subject, citizen and consumer. We also examine whether clinicians are appropriately led and equipped with education and tools for use at the point of care. Only with these in place will we be able to deliver solutions to the data quality challenges. Only when these criteria are met to the highest possible standard will big data start to meet the many expectations already placed on it.
Notes
1. The older term ‘Personalised Medicine’, which is often used interchangeably with ‘Precision Medicine’, will not be used in this chapter. Personalised Medicine, which implies specific manufacture or synthesis for the individual, is a valid but much more specific concept.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bainbridge, M. (2019). Big Data Challenges for Clinical and Precision Medicine. In: Househ, M., Kushniruk, A., Borycki, E. (eds) Big Data, Big Challenges: A Healthcare Perspective. Lecture Notes in Bioengineering. Springer, Cham. https://doi.org/10.1007/978-3-030-06109-8_2
DOI: https://doi.org/10.1007/978-3-030-06109-8_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-06108-1
Online ISBN: 978-3-030-06109-8
eBook Packages: Medicine, Medicine (R0)