The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts
The capacity to collect and analyse data is growing exponentially. Referred to as ‘Big Data’, this scientific, social and technological trend has helped create destabilising amounts of information, which can challenge accepted social and ethical norms. Big Data remains a fuzzy idea, emerging across social, scientific, and business contexts sometimes seemingly related only by the gigantic size of the datasets being considered. As is often the case with the cutting edge of scientific and technological progress, understanding of the ethical implications of Big Data lags behind. In order to bridge such a gap, this article systematically and comprehensively analyses academic literature concerning the ethical implications of Big Data, providing a watershed for future ethical investigations and regulations. Particular attention is paid to biomedical Big Data due to the inherent sensitivity of medical information. By means of a meta-analysis of the literature, a thematic narrative is provided to guide ethicists, data scientists, regulators and other stakeholders through what is already known or hypothesised about the ethical risks of this emerging and innovative phenomenon. Five key areas of concern are identified: (1) informed consent, (2) privacy (including anonymisation and data protection), (3) ownership, (4) epistemology and objectivity, and (5) ‘Big Data Divides’ created between those who have or lack the necessary resources to analyse increasingly large datasets. Critical gaps in the treatment of these themes are identified with suggestions for future research. Six additional areas of concern are then suggested which, although related, have not yet attracted extensive debate in the existing literature.
It is argued that they will require much closer scrutiny in the immediate future: (6) the dangers of ignoring group-level ethical harms; (7) the importance of epistemology in assessing the ethics of Big Data; (8) the changing nature of fiduciary relationships that become increasingly data-saturated; (9) the need to distinguish between ‘academic’ and ‘commercial’ Big Data practices in terms of potential harm to data subjects; (10) future problems with ownership of intellectual property generated from analysis of aggregated datasets; and (11) the difficulty of providing meaningful access rights to individual data subjects who lack the necessary resources. Considered together, these eleven themes provide a thorough critical framework to guide ethical assessment and governance of emerging Big Data practices.
Keywords: Ethics, Big data, Bioethics, Information ethics, Medical ethics, Ethical foresight
The research leading to this work has been funded by a <removed for anonymity> major research grant. An initial version of this paper was discussed at a workshop organised at the <removed for anonymity> on <removed for anonymity>. We wish to acknowledge the extremely valuable feedback received during that meeting and from the two anonymous reviewers. This study was funded by the University of Oxford’s John Fell Fund.
Conflict of interest
The authors declare that they have no conflict of interest.