Explainability for Clustering Models

Arora, Mahima; Chopra, Ankush

doi:10.1007/978-981-99-0405-1_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1771)

Included in the following conference series:

International Conference on Soft Computing in Data Science

Abstract

The field of Artificial Intelligence is growing rapidly. The application of larger and more complex algorithms has become commonplace, making them harder to understand. Explainability of the algorithms and models used in practice has become a necessity, as these models are widely adopted to make significant and consequential decisions, which makes it all the more important to keep our understanding of the decisions and results of AI up to date. Explainable AI methods currently address interpretability, explainability, and fairness in supervised learning methods; very little attention has been paid to explaining the results of unsupervised learning methods. This paper proposes an extension of supervised explainability methods to unsupervised methods as well. We have researched and experimented with widely used clustering models to show the applicability of the proposed solution to the most commonly practiced unsupervised problems. We have also thoroughly investigated methods to validate the results of both the supervised and unsupervised explainability modules.
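The abstract does not spell out the procedure, so the following is only a minimal sketch of one common way to carry supervised explainability over to clustering, and it may differ from the authors' actual pipeline: cluster the data, treat the cluster labels as targets, fit a simple supervised surrogate, and read off which features the surrogate relies on. Everything here is an assumption made for illustration: the synthetic two-feature data, the hand-written k-means, the one-feature threshold rule used as the surrogate, and the helper names kmeans and stump_fidelity. Only the Python standard library is used.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def stump_fidelity(points, labels, feature):
    """Best agreement between a one-feature threshold rule and two cluster labels.

    This is a hypothetical surrogate: a feature that reproduces the
    clustering well on its own is treated as a good explanation of it.
    """
    best = 0.0
    for t in sorted(set(p[feature] for p in points)):
        pred = [0 if p[feature] <= t else 1 for p in points]
        agree = sum(a == b for a, b in zip(pred, labels)) / len(points)
        # Cluster ids are arbitrary, so the flipped rule counts as well.
        best = max(best, agree, 1 - agree)
    return best


# Synthetic data (assumption): feature 0 separates two groups, feature 1 is noise.
rng = random.Random(42)
points = [(rng.gauss(0.0, 0.5), rng.random()) for _ in range(30)]
points += [(rng.gauss(10.0, 0.5), rng.random()) for _ in range(30)]

labels = kmeans(points, k=2)
fidelity = [stump_fidelity(points, labels, j) for j in range(2)]

for j, score in enumerate(fidelity):
    print(f"feature {j}: surrogate agreement with clusters = {score:.2f}")
```

The agreement score doubles as a simple validation check: a surrogate that cannot reproduce the cluster assignments is not a trustworthy basis for explaining them. In practice a stronger classifier combined with an attribution method would likely replace the threshold rule used here.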

Author information

Authors and Affiliations

Tredence Inc., Bengaluru, 560048, India
Mahima Arora & Ankush Chopra

Corresponding author

Correspondence to Mahima Arora.

Editor information

Editors and Affiliations

Universiti Teknologi MARA, Shah Alam, Malaysia
Marina Yusoff
Baoji University of Arts and Sciences, Baoji, China
Tao Hai
Universiti Teknologi MARA, Shah Alam, Malaysia
Murizah Kassim
Universiti Teknologi MARA, Shah Alam, Malaysia
Azlinah Mohamed
Institute of Liberal Arts and Sciences, Nagoya, Japan
Eisuke Kita

About this paper

Cite this paper

Arora, M., Chopra, A. (2023). Explainability for Clustering Models. In: Yusoff, M., Hai, T., Kassim, M., Mohamed, A., Kita, E. (eds) Soft Computing in Data Science. SCDS 2023. Communications in Computer and Information Science, vol 1771. Springer, Singapore. https://doi.org/10.1007/978-981-99-0405-1_1

DOI: https://doi.org/10.1007/978-981-99-0405-1_1
Published: 17 March 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-0404-4
Online ISBN: 978-981-99-0405-1
eBook Packages: Computer Science (R0)
