The Ethics of Data Science

Part of the Springer Series in the Data Sciences book series (SSDS)


In 2017, Ali Rahimi and Benjamin Recht received the “Test of Time Award” at the Neural Information Processing Systems (NIPS) conference—one of the top research conferences for machine learning and artificial intelligence.

  • DOI: 10.1007/978-3-030-71352-2_14
  • Chapter length: 15 pages
  • ISBN: 978-3-030-71352-2


  1.


  2. At the time of writing this textbook, practitioners in the tech sector who use the term ML usually mean neural networks and deep learning. In less cutting-edge sectors, the same term covers both traditional ML (as covered in this book) and newer developments. Either way, transparent models are an emerging necessity regardless of the form of ML.

  3. The study found a number of contributing factors; however, for the sake of brevity, we focus on sampling biases in this example.

  4. Most statistical methods and some machine learning algorithms provide a facility for incorporating weights. When this is not possible, each record can be duplicated in proportion to its sampling weight (e.g., with the rep function) to artificially impose the weight structure.

  5. See Chapter 12 for a refresher on language models.

  6. Even with a bench of academic experts leading the charge, the bureaucracy and politics of local government prevented access to even the most basic information needed for the task force’s work, ironically leading to a lack of transparency for the transparency investigation (Kaye 2019).

  7. The terms “interpretability” and “explainability” are both used in debates about transparent machine learning. While they have closely related meanings and are often used interchangeably, some researchers argue that there are differences.

  8. Recent research, such as Athey and Wager (2019), has begun to blend non-interpretable models with causal inference. Thus, we set causality aside in this discussion.

  9. For more on these three explainable ML diagnostics, revisit Chapter 10.

  10. See Staniak and Biecek (2019) for the implementation in R.

  11. For more information on this R package, visit

  12. Known by the acronym FAT prior to 2020.
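
The record-duplication workaround described in note 4 can be illustrated with a minimal sketch. It is written in Python with NumPy rather than R, and `np.repeat` stands in for the `rep` function mentioned in the note. The helper name `expand_by_weight`, the toy data, and the choice to round non-integer weights to the nearest whole number are all assumptions made for illustration; they are not taken from the chapter.

```python
import numpy as np

def expand_by_weight(records, weights):
    """Repeat each record in proportion to its sampling weight.

    Assumption: weights are rounded to the nearest integer so they
    can be used as repetition counts.
    """
    counts = np.rint(np.asarray(weights, dtype=float)).astype(int)
    if np.any(counts < 0):
        raise ValueError("sampling weights must be non-negative")
    # Repeat row i of `records` counts[i] times along the first axis.
    return np.repeat(np.asarray(records), counts, axis=0)

# Hypothetical toy sample: three respondents and their sampling weights.
ages = np.array([25, 40, 63])
weights = np.array([1, 3, 2])

expanded = expand_by_weight(ages, weights)
print(expanded)         # [25 40 40 40 63 63]
print(expanded.mean())  # equals np.average(ages, weights=weights)
```

An unweighted estimator applied to the expanded data reproduces the weighted point estimate. Because duplication inflates the apparent sample size, measures of uncertainty computed on the expanded data should be treated with caution.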

Author information

Corresponding author

Correspondence to Jeffrey C. Chen.

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Chen, J.C., Rubin, E.A., Cornwall, G.J. (2021). The Ethics of Data Science. In: Data Science for Public Policy. Springer Series in the Data Sciences. Springer, Cham.
