The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts

Part of the book series: Law, Governance and Technology Series (LGTS, volume 29)

Abstract

The capacity to collect and analyse data is growing exponentially. Referred to as ‘Big Data’, this scientific, social and technological trend has helped create destabilising amounts of information, which can challenge accepted social and ethical norms. Big Data remains a fuzzy idea, emerging across social, scientific, and business contexts sometimes seemingly related only by the gigantic size of the datasets being considered. As is often the case with the cutting edge of scientific and technological progress, understanding of the ethical implications of Big Data lags behind. In order to bridge such a gap, this article systematically and comprehensively analyses academic literature concerning the ethical implications of Big Data, providing a watershed for future ethical investigations and regulations. Particular attention is paid to biomedical Big Data due to the inherent sensitivity of medical information. By means of a meta-analysis of the literature, a thematic narrative is provided to guide ethicists, data scientists, regulators and other stakeholders through what is already known or hypothesised about the ethical risks of this emerging and innovative phenomenon. Five key areas of concern are identified: (1) informed consent, (2) privacy (including anonymisation and data protection), (3) ownership, (4) epistemology and objectivity, and (5) ‘Big Data Divides’ created between those who have or lack the necessary resources to analyse increasingly large datasets. Critical gaps in the treatment of these themes are identified with suggestions for future research. Six additional areas of concern are then suggested which, although related, have not yet attracted extensive debate in the existing literature. It is argued that they will require much closer scrutiny in the immediate future: (6) the dangers of ignoring group-level ethical harms; (7) the importance of epistemology in assessing the ethics of Big Data; (8) the changing nature of fiduciary relationships that become increasingly data saturated; (9) the need to distinguish between ‘academic’ and ‘commercial’ Big Data practices in terms of potential harm to data subjects; (10) future problems with ownership of intellectual property generated from analysis of aggregated datasets; and (11) the difficulty of providing meaningful access rights to individual data subjects who lack necessary resources. Considered together, these eleven themes provide a thorough critical framework to guide ethical assessment and governance of emerging Big Data practices.

This chapter is re-printed with the permission of Springer. The chapter was previously published in Science and Engineering Ethics: Mittelstadt, Brent Daniel, and Luciano Floridi. 2016. “The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts.” Science and Engineering Ethics, 22(2): 303–341. doi:10.1007/s11948-015-9652-2. Page numbers from the original publication should be used when citing. Appendices are excluded; please see the original publication.

Notes

  1. For example, the identification of the presence of diabetes can support targeted marketing (Terry 2012, p. 392).

  2. For an overview of sample companies providing such services, see Costa (2014).

  3. In some contexts, such as the USA under HIPAA, administrative data will be afforded less protection than genomic and similar biobank data despite possessing similar capacities for revealing sensitive aspects of a person’s health. This may be due partly to the possibility of removing identifiers from administrative data without ‘ruining’ the data (Currie 2013), something that appears not to be possible when anonymising genomic data (Hansson 2009, p. 10).

  4. These forms of biomedical data are incredibly varied and complex, consisting of data produced from a wide variety of sources, including “laboratory auto-analyzers, pharmacy systems, and clinical imaging systems…augmented by data from systems supporting health administrative functions such as patient demographics, insurance coverage, financial data, etc…clinical narrative information, captured electronically as structured data or transcribed ‘free text’…electronic health records” to name but a few (Safran et al. 2006, p. 2).

  5. For instance, Facebook has recently announced plans for “support communities” and “preventative care applications” (Reuters 2014), while Google and Apple have recently released platforms for health and fitness data aggregation (Google Fit and Apple HealthKit/ResearchKit).

  6. However, the efficacy of such platforms remains questionable (Butler 2013).

  7. See for example the UK Biobank Ethics and Governance Council: http://www.egcukbiobank.org.uk/.

  8. By some accounts, moral obligations exist for medical research. As suggested by accounts of solidarity-based governance of biomedical Big Data (e.g. Prainsack and Buyx 2013), patients may have a moral duty to participate in research due to the value generated through advances in medical knowledge and treatments (Harris 2005; Schaefer et al. 2009). As participation inherently includes risks, researchers may similarly have a moral obligation to minimise risks as far as possible by extracting maximum value from existing datasets through re-purposing and aggregation (Currie 2013; Harris 2005).

  9. The shift to solidarity is also said to free up the “significant resources” currently spent on (re-)consenting procedures for primary and secondary uses of data held in biobanks for research, innovation and infrastructural improvements including interoperability between repositories (Prainsack and Buyx 2013, p. 80). This position rests on the assumption that significant resources are currently being spent on re-consent procedures in particular, which are a central concern for consent and Big Data (e.g. Wellcome Trust 2013), and that these resources would instead be spent on valuable research and structural improvements.

  10. The relative lack of reporting on harms stemming from abuses of biomedical data has been noted in a recent Nuffield Council report on the ethics of linking biomedical datasets for research (Nuffield Council on Bioethics 2015). This gap has been largely attributed to the absence of robust reporting mechanisms and of empirical research on underreporting, with most cases coming from anecdotal accounts and notable media stories. As a result, a lack of evidence of harms should not be considered evidence of a lack of harms.

  11. The applicability of theories on the ethics of care (e.g. Gilligan 1982; Noddings 2013; Slote 2007) to Big Data likely extends beyond discrimination against marginalised groups. For example, emphasising responsiveness and relationships between data subjects, custodians and analysts may provide avenues for development of new privacy protection mechanisms and group-level ethics which acknowledge the network ethical effects possible through Big Data (see Section 5.1). While a full account of this and related topics concerning ethics of care goes beyond the scope of this paper, existing work on the applicability of the ethics of care to public health (e.g. Kass 2001) may provide a starting point for future enquiries.

  12. With these tendencies noted, the capacity of Big Data to provide scientific explanations of particular types of social phenomena or human behaviours should not be rejected (e.g. Schroeder 2014).

  13. For further details on the specification of the right to be forgotten by Google in the EU, see: Advisory Council to Google on the Right to be Forgotten (2015).

  14. Regulatory action may be required, as Big Data creates new opportunities for “data aggregators and miners to…run around health care’s domain-specific protections by creating medical profiles of individuals” not subject to existing legislation (Terry 2012, p. 386), as was the case with the Google Health platform which operated outside of HIPAA restrictions in the United States (Mora 2012, p. 373).

  15. As an example of the latter, if biobanking research utilising genome sequences were to reveal that obesity is linked primarily to behaviour rather than genes, or an ethnic group were shown to have a higher genetic pre-disposition to cancer (cf. Angrist 2009; Mathaiyan et al. 2013), well-meaning research may inadvertently lead to future discrimination against these groups.

Acknowledgements

The research leading to this work has been funded by a John Fell Fund major research grant. An initial version of this paper was discussed at the Ethics of Biomedical Big Data workshop organised in April 2015 at the Oxford Internet Institute. We wish to acknowledge the extremely valuable feedback received during that meeting and from the two anonymous reviewers.

Author information

Corresponding author

Correspondence to Brent Daniel Mittelstadt.

Copyright information

© 2016 Springer International Publishing Switzerland

Cite this chapter

Mittelstadt, B.D., Floridi, L. (2016). The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts. In: Mittelstadt, B., Floridi, L. (eds) The Ethics of Biomedical Big Data. Law, Governance and Technology Series, vol 29. Springer, Cham. https://doi.org/10.1007/978-3-319-33525-4_19