Approaching Ethical Guidelines for Data Scientists

  • Ursula Garczarek
  • Detlef Steuer
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


The goal of this article is to inspire data scientists to participate in the debate on the impact that their professional work has on society, and to become active in public debates on the digital world as data science professionals. How do ethical principles (e.g. fairness, justice, beneficence and non-maleficence) relate to actual situations in our professional lives? What responsibilities follow from our expertise in the field? More specifically, this article appeals to statisticians, who may consider neither themselves data scientists nor their work data science, to join that debate and to be part of the community that establishes data science as a proper profession in the sense of Airaksinen (2009), a philosopher working on professional ethics. As we will argue, data science has one of its roots in statistics but also comprises additional tasks and features that extend beyond it. To shape the future of statistics, and to take responsibility for the statistical contributions to data science, statisticians should actively engage in the discussions. In Sect. 10.1, the term data science is defined, and the technical changes that have led to a strong influence of data science on society are outlined. In Sect. 10.2, the systematic approach from Commission Nationale Informatique & Liberte (2018) is introduced. Along the lines of that approach, prominent examples are given of ethical issues arising from the work of data scientists. In Sect. 10.3, we provide reasons why data scientists should engage in shaping morality around data science and in formulating codes of conduct and codes of practice for data science professionals. In Sect. 10.4, we present established ethical guidelines for the related fields of statistics and computing science. Section 10.5 describes the steps the community needs to take to develop professional ethics for data science. Finally, in Sect. 10.6, we motivate our own engagement and give our opening statement for the debate: Data science is at the focal point of current societal development. Unless it becomes a profession with professional ethics, data science will fail to build trust in its interaction with, and its much-needed contributions to, society!


  1. Airaksinen, T. (2009). The Philosophy of Professional Ethics. In R.C. Elliot (Ed.), INSTITUTIONAL ISSUES INVOLVING ETHICS AND JUSTICE, (Vol. 1, pp. 201) in Encyclopedia of life support systems (EOLSS), Developed under the Auspices of the UNESCO, Paris, France: Eolss Publishers.
  2. American Statistical Association, Ethical Guidelines for Statistical Practice, Approved April 2018. Cited 9 Nov 2018.
  3. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016a). ProPublica: Machine bias, May 23, 2016. Cited 24 Oct 2018.
  4. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016b). ProPublica: How we analyzed the COMPAS recidivism algorithm, May 23, 2016. Cited 24 Oct 2018.
  5. Association for Computing Machinery (ACM), ACM Code of Ethics and Professional Conduct, Approved June 2018. Cited 9 Nov 2018.
  6. Bundeskriminalamt, Presseinformation: Neues Instrument zur Risikobewertung von potentiellen Gewaltstraftätern, RADAR-iTE (Regelbasierte Analyse potentiell destruktiver Täter zur Einschätzung des akuten Risikos - islamistischer Terrorismus), 2 Feb 2017. Cited 6 Nov 2018.
  7. Bundesministerium der Justiz und für Verbraucherschutz, Gesetz zur Verbesserung der Rechtsdurchsetzung in sozialen Netzwerken (Netzwerkdurchsetzungsgesetz - NetzDG), 1.9.2017. Cited 6 Nov 2018.
  8. Commission Nationale Informatique & Liberte, Algorithms and artificial intelligence: CNIL’s report on the ethical issues, 25 May 2018. Cited 2 Nov 2018.
  9. Commission Nationale Informatique & Liberte, HOW CAN HUMANS KEEP THE UPPER HAND? The ethical matters raised by algorithms and artificial intelligence, Dec 2017. Cited 2 Nov 2018.
  10. Confessore, N. (2018). New York times, Cambridge analytica and facebook: The scandal and the fallout so far, 4 Apr 2018. Cited 6 Nov 2018.
  11. Dastin, J. (2018). Amazon scraps secret AI recruiting tool that showed bias against women, Reuters news, 22 Oct 2018, 5:12 am. Cited 1 Nov 2018.
  12. De Veaux, R.D., Agarwal, M., Averett, M., Baumer, B.S., Bray, A., Bressoud, T.C., et al. (2017). Curriculum guidelines for undergraduate programs in data science. Annual Review of Statistics and Its Application, 4(1), 15–30.
  13. U.S. Department of Health and Human Services. (2002). Standards for privacy of individually identifiable health information, final rule. Federal Register, 45(CFR), 160–164.
  14. Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (GMDS), Manzeschke, A., Winter, A., Isele, C., Deserno, T., Pallas, F., Weber, K., & Niederlag, W. (2017). Workshop: Ethische Leitlinien wissenschaftlicher Fachgesellschaften, 4 May 2017. Cited 6 Nov 2018.
  15. Diallo, I. (2018). The machine fired me, 17 June 2018. Cited 2 Nov 2018.
  16. Donoho, D. (2017). 50 Years of Data Science. Journal of Computational and Graphical Statistics, 26(4), 745–766.
  17. Erickson, L.C., Harris, N.E., & Lee, M.M. (2018). It’s time to talk about data ethics, 26 Mar 2018. Cited 6 Nov 2018.
  18. European Commission, Cybersecurity & Digital Privacy Policy (Unit H.2), eCall: Time saved = lives saved, 14 Feb 2018. Cited 7 Nov 2018.
  19. FAZ, Jeder zweite Gefährder hat das Potential zum Terroristen, 18.12.2017. Cited 6 Nov 2018.
  20. Fanta, A. (2018). Österreichs Jobcenter richten künftig mit Hilfe von Software über Arbeitslose, 13 Oct 2018. Cited 2 Nov 2018.
  21. Futurezone, AMS-Chef: Mitarbeiter schätzen Jobchancen pessimistischer ein als der Algorithmus, 12 Oct 2018. Cited 6 Nov 2018.
  22. Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289–310.
  23. Gesellschaft für Informatik, Ethical Guidelines of the German Informatics Society, 29 June 2018. Cited 6 Nov 2018.
  24. Goldberger, A.L., Amaral, L.A.N., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., et al. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215–e220. [Circulation Electronic Pages.]
  25. Greiner, W., Batram, M., Damm, O., Scholz, S., & Witte, J. (2018). Kinder- und Jugendreport 2018, Beiträge zur Gesundheitsökonomie und Versorgungsforschung (Band 23), Andreas Storm (Herausgeber), DAK-Gesundheit.
  26. Horner, J. (2003). Morality, ethics, and law: Introductory concepts. Seminars in Speech and Language, 24(4), 263–274.
  27. James, L. (2018). Oaths, pledges and manifestos: A master list of ethical tech values, doteveryone, 7 Mar 2018. Cited 6 Nov 2018.
  28. Kuntz, B., Mauz, M., & Lampert, T. (2018). Die KiGGS-Studie des Robert Koch-Instituts: Studiendesign, Erhebungsinhalte und Ergebnisse zur gesundheitlichen Ungleichheit im Kindes- und Jugendalter – Robert Koch-Institut, Berlin, Kinder- und Jugendreport 2018, Beiträge zur Gesundheitsökonomie und Versorgungsforschung (Band 23), Andreas Storm (Herausgeber), DAK-Gesundheit.
  29. Kurth, B.M., Kamtsiuris, P., Hölling, H., Schlaud, M., Dölle, R., Ellert, U., et al. (2008). The challenge of comprehensively mapping children’s health in a nation-wide health survey: Design of the German KiGGS-Study. BMC Public Health, 8, 196.
  30. Marr, B. (2015). Forbes, How big data is changing insurance forever, 16 Dec 2015.
  31. O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. New York: Crown Publishing Group.
  32. Panoptykon Foundation, Niklas, J., Sztandar-Sztanderska, K., & Szymielewicz, K. (2015). Warsaw. Cited 2 Nov 2018.
  33. Pariser, E. (2012). The filter bubble: How the new personalized web is changing what we read and how we think. New York: Penguin Books. ISBN: 0143121235.
  34. Perez, S. (2016). Microsoft silences its new A.I. bot Tay, after Twitter users teach it racism. Cited 1 Nov 2018.
  35. Pollard, T.J., Johnson, A.E.W. (2016). The MIMIC-III clinical database.
  36. Pottegård, A., Kristensen, K.B., Ernst, M.T., Johansen, N.B., Quartarolo, P., & Hallas, J. (2018). Use of N-nitrosodimethylamine (NDMA) contaminated valsartan products and risk of cancer: Danish nationwide cohort study (Vol. 362). BMJ Publishing Group Ltd.
  37. Schroepfer, M. (2018). CTO facebook, An update on our plans to restrict data access on facebook, 4 Apr 2018. Cited 6 Nov 2018.
  38. Seeger, J. (2016). ADAC-Untersuchung: Autohersteller sammeln Daten in großem Stil, 4 June 2016. Cited 6 Nov 2018.
  39. Wissenschaftlicher Dienst des Bundestags Fachbereich WD 3: Verfassung und Verwaltung, Veröffentlichung der Ergebnisse von Umfragen vor Wahlen (Deutschland und Mitgliedstaaten der EU), Aktenzeichen WD 3 - 3000 - 058/18, 2018. Cited 2 Nov 2018.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Cytel Inc, Clinical Research Services ICC, Geneva, Switzerland
  2. Helmut-Schmidt-Universität, Universität der Bundeswehr Hamburg, Hamburg, Germany
