Using Explainable Boosting Machines (EBMs) to Detect Common Flaws in Data

Part of the Communications in Computer and Information Science book series (CCIS, volume 1524)


Every dataset is flawed, often in surprising ways that data scientists might not anticipate. However, most popular machine learning methods are black boxes. Because they lack interpretability, they can learn defective knowledge from flawed datasets, and such defects are difficult to detect. In this work, we show how interpretable machine learning methods such as EBMs can help users detect problems lurking in their data. Specifically, we present a number of case studies in which EBMs reveal common types of dataset flaws, including missing values, confounding and treatment effects, data drift, bias and fairness issues, and outliers. In each case study, we analyze the flaws by visualizing EBM shape functions and combining them with domain knowledge. We also demonstrate that, in some cases, interpretable learning methods such as EBMs provide simple tools for correcting problems when correcting the data itself is difficult.
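To make the idea of reading a flaw off a shape function concrete, the following is a minimal sketch, not the authors' method and not the InterpretML API. It assumes a simplified stand-in for an EBM: an additive model with equal-width bins, fitted by round-robin boosting on one feature at a time. The data are synthetic and every name (`fit_additive`, `largest_kink`, the 15% missingness rate, the bin count) is a hypothetical choice made for illustration. Missing values of the first feature are mean-imputed, and the rows with missing values are given systematically lower outcomes, so the learned shape function shows a sharp kink in the bin that contains the imputed value.

```python
import random

N_BINS, LO, HI = 20, 0.0, 100.0  # assumed equal-width binning over [0, 100]

def bin_of(value):
    """Map a feature value to its bin index."""
    return min(int((value - LO) / (HI - LO) * N_BINS), N_BINS - 1)

def fit_additive(X, y, rounds=100, lr=0.1):
    """Round-robin boosting of per-bin scores, one feature at a time.

    A simplified stand-in for an EBM: each feature gets a piecewise-constant
    shape function, and the prediction is intercept + sum of bin scores.
    """
    n, d = len(X), len(X[0])
    bins = [[bin_of(v) for v in row] for row in X]
    intercept = sum(y) / n
    shape = [[0.0] * N_BINS for _ in range(d)]
    pred = [intercept] * n
    for _ in range(rounds):
        for j in range(d):
            sums, counts = [0.0] * N_BINS, [0] * N_BINS
            for i in range(n):
                sums[bins[i][j]] += y[i] - pred[i]   # residual per bin
                counts[bins[i][j]] += 1
            # Small step toward the mean residual of each non-empty bin.
            step = [lr * s / c if c else 0.0 for s, c in zip(sums, counts)]
            for b in range(N_BINS):
                shape[j][b] += step[b]
            for i in range(n):
                pred[i] += step[bins[i][j]]
    return intercept, shape

def largest_kink(scores):
    """Return (bin, deviation) where a score differs most from its neighbors' mean."""
    best, best_dev = None, 0.0
    for b in range(1, len(scores) - 1):
        dev = abs(scores[b] - 0.5 * (scores[b - 1] + scores[b + 1]))
        if dev > best_dev:
            best, best_dev = b, dev
    return best, best_dev

# Synthetic data (assumption): the outcome rises smoothly with both features.
rng = random.Random(0)
X, y = [], []
for _ in range(4000):
    x1, x2 = rng.uniform(LO, HI), rng.uniform(LO, HI)
    target = 0.05 * x1 + 0.02 * x2 + rng.gauss(0, 0.5)
    if rng.random() < 0.15:
        # x1 was never measured for these rows, which also have lower outcomes;
        # the missing value is replaced by the feature mean (50.0).
        target -= 2.0
        x1 = 50.0
    X.append([x1, x2])
    y.append(target)

intercept, shape = fit_additive(X, y)
spike_bin, dev_before = largest_kink(shape[0])
print("most anomalous bin for x1:", spike_bin, "deviation:", round(dev_before, 2))

# Hypothetical model edit: smooth the anomalous bin using its neighbors.
edited = list(shape[0])
edited[spike_bin] = 0.5 * (edited[spike_bin - 1] + edited[spike_bin + 1])
_, dev_after = largest_kink(edited)
```

The last three lines sketch the kind of model edit the abstract alludes to: rather than repairing the data, the anomalous bin score is replaced by the average of its neighbors. Whether such an edit is appropriate depends on domain knowledge about why the values were missing.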


  • Interpretability
  • Generalized additive model
  • Debugging datasets
  • Model editing
  • Missing values
  • Treatment effects
  • Fairness




Author information

Corresponding authors

Correspondence to Zhi Chen or Rich Caruana.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, Z., Tan, S., Nori, H., Inkpen, K., Lou, Y., Caruana, R. (2021). Using Explainable Boosting Machines (EBMs) to Detect Common Flaws in Data. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93735-5

  • Online ISBN: 978-3-030-93736-2

  • eBook Packages: Computer Science (R0)