
Explainability for Clustering Models

  • Conference paper
Soft Computing in Data Science (SCDS 2023)

Abstract

The field of Artificial Intelligence is growing at a very high pace. Larger and more complex algorithms have become commonplace, and the resulting models are harder to understand. Because these models are widely adopted to make significant and consequential decisions, explaining the algorithms and models used in practice has become a necessity, and it is all the more important to keep our understanding of the decisions and results of AI up to date. Explainable AI methods currently address interpretability, explainability, and fairness in supervised learning. Very little attention has been paid to explaining the results of unsupervised learning methods. This paper proposes an extension of supervised explainability methods to unsupervised methods as well. We have researched and experimented with widely used clustering models to show the applicability of the proposed solution to the most commonly practiced unsupervised problems. We have also thoroughly investigated methods for validating the results of both the supervised and unsupervised explainability modules.
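The abstract does not spell out how the supervised methods are extended, so the following is only a minimal sketch of one common reading: cluster the data, treat the cluster assignments as class labels, fit a simple supervised surrogate on those labels, and read off which features the surrogate needs. The toy data, the hand-written `kmeans`, and the one-feature threshold rule in `stump_accuracy` are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: toy data, a tiny k-means, and a one-feature
# threshold rule ("stump") as the supervised surrogate. None of these
# choices come from the paper.

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm; returns one cluster index per point."""
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        centroids = updated
    return labels


def stump_accuracy(points, labels, feature):
    """Best accuracy of a single-threshold rule on one feature (two clusters)."""
    values = sorted({p[feature] for p in points})
    n = len(points)
    best = 0.0
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        # Rule: predict cluster 1 above the threshold, cluster 0 below it.
        hits = sum((p[feature] > threshold) == (lab == 1)
                   for p, lab in zip(points, labels))
        # The opposite polarity gets n - hits correct; keep the better one.
        best = max(best, max(hits, n - hits) / n)
    return best


# Hypothetical data: feature 0 separates two groups, feature 1 is noise.
points = [(1.0, 2.0), (1.2, 3.0), (0.8, 2.5), (1.1, 2.2),
          (5.0, 2.1), (5.2, 2.9), (4.9, 2.4), (5.1, 2.6)]

labels = kmeans(points, centroids=[points[0], points[-1]])
scores = {f: stump_accuracy(points, labels, f) for f in range(2)}
for f, s in scores.items():
    print(f"feature {f}: surrogate accuracy {s:.2f}")
```

On this toy data a threshold on feature 0 reproduces the cluster assignments exactly, while no threshold on feature 1 does, which is the kind of per-feature signal a supervised explainer would surface. In practice the stump would be replaced by a stronger classifier and an established attribution method.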



Corresponding author

Correspondence to Mahima Arora.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Arora, M., Chopra, A. (2023). Explainability for Clustering Models. In: Yusoff, M., Hai, T., Kassim, M., Mohamed, A., Kita, E. (eds) Soft Computing in Data Science. SCDS 2023. Communications in Computer and Information Science, vol 1771. Springer, Singapore. https://doi.org/10.1007/978-981-99-0405-1_1

  • DOI: https://doi.org/10.1007/978-981-99-0405-1_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-0404-4

  • Online ISBN: 978-981-99-0405-1

  • eBook Packages: Computer Science, Computer Science (R0)
