Data science ethical considerations: a systematic literature review and proposed project framework

  • Jeffrey S. Saltz
  • Neil Dewar
Original Paper

Abstract

Data science, and the related field of big data, is an emerging discipline involving the analysis of data to solve problems and develop insights. This rapidly growing domain promises many benefits to both consumers and businesses. However, the use of big data analytics can also raise ethical concerns, stemming from, for example, the possible loss of privacy or harm to a subgroup of the population caused by a classification algorithm. To help address these potential ethical challenges, this paper maps and describes the main ethical themes identified via a systematic literature review. It then proposes a possible structure for integrating these themes within a data science project, thereby helping to organize the ongoing debate about the ethical situations that can arise when using data science analytics.

Keywords

Big data; Data science; Ethics; Code of conduct


Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Syracuse University, Syracuse, USA
