Comparing Different Approaches to Archetypal Analysis as a Fuzzy Clustering Tool

International Journal of Fuzzy Systems

Abstract

We summarize the results of an intensive simulation study carried out to compare the performance of three approaches to archetypal analysis, regarded as a fuzzy clustering tool: the original approach of Cutler and Breiman (Technometrics 36(4):338–347, 1994), the proposal of Ding et al. (IEEE Trans Pattern Anal Mach Intell 32(1):45–55, 2010), and the factorized fuzzy c-means algorithm. The artificial data used in the experiment are generated from polytopes in low-dimensional spaces \({\mathbb {R}}^{n}\) \(\left( 2\le n\le 7\right) \) and cover a variety of cluster contexts. The simulation results show that the original proposal is generally the more accurate method, both for uncovering the cluster structure hidden in the data and for reproducing the data themselves. However, this advantage, if any, is not clear for data arising from real-life unsupervised clustering problems.
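To make the setting concrete, the following is a minimal sketch of how data of this kind could be generated; it is an illustration under stated assumptions, not the procedure used in the study. It assumes that each point is a convex combination of the vertices of a polytope, with weights drawn from a Dirichlet distribution, so that the weights can be read as fuzzy memberships. The function names, the Dirichlet choice, and the triangle example are all hypothetical. When the vertices are known and form a simplex, the memberships are the barycentric coordinates and can be recovered exactly, which is what the second function does; archetypal analysis itself must also estimate the vertices, and that step is not shown here.

```python
import numpy as np

def sample_from_polytope(vertices, n_points, alpha=1.0, seed=0):
    """Draw points as random convex combinations of the polytope vertices."""
    rng = np.random.default_rng(seed)
    k = vertices.shape[0]
    # Each row is nonnegative and sums to 1, i.e. a fuzzy membership vector.
    # The Dirichlet distribution is an assumption made for this sketch.
    memberships = rng.dirichlet(alpha * np.ones(k), size=n_points)
    return memberships @ vertices, memberships

def barycentric_memberships(points, vertices):
    """Recover memberships for KNOWN vertices forming a simplex (k = n + 1)."""
    k = vertices.shape[0]
    # Coordinate equations stacked with the sum-to-one constraint.
    A = np.vstack([vertices.T, np.ones((1, k))])               # shape (n + 1, k)
    b = np.vstack([points.T, np.ones((1, points.shape[0]))])   # shape (n + 1, N)
    return np.linalg.solve(A, b).T                             # shape (N, k)

# Hypothetical example: a triangle in R^2 (n = 2, three vertices).
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
points, true_u = sample_from_polytope(vertices, n_points=200)
est_u = barycentric_memberships(points, vertices)
```

With more vertices than n + 1 the convex weights are no longer unique, so the exact linear solve above would have to be replaced by a constrained least-squares fit.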

Notes

  1. A special session entitled 'SS_37: Matrix Factorization for Fuzzy Clustering and Related Approaches' took place at the 2017 IEEE International Conference on Fuzzy Systems in Naples, Italy: https://www.fuzzieee2017.org/specialSessions.php.

References

  1. Banerjee, A., Merugu, S., Dhillon, I.S., Ghosh, J.: Clustering with Bregman divergences. J. Mach. Learn. Res. 6, 1705–1749 (2005)

  2. Berry, M.W., Browne, M., Langville, A.N., Pauca, V.P., Plemmons, R.J.: Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, 155–173 (2007)

  3. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York (1981)

  4. Bauckhage, C.: A Note on Archetypal Analysis and the Approximation of Convex Hulls (2014). arXiv:1410.0642. Accessed 27 Nov 2017

  5. Casalino, G., Buono, N.D., Mencar, C.: Subtractive clustering for seeding non-negative matrix factorizations. Inf. Sci. 257, 369–387 (2014)

  6. Chawla, N.V.: Data mining for imbalanced data: an overview. In: Maimon, O., Rokach, L. (eds.) Data Mining and Knowledge Discovery Handbook, pp. 853–867. Springer, Cham (2005)

  7. Chen, Y., Mairal, J., Harchaoui, Z.: Fast and robust archetypal analysis for representation learning. In: Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1478–1485 (2014)

  8. Cichocki, A., Zdunek, R., Amari, S.: Csiszár’s divergence for non-negative matrix factorization: family of new algorithms. In: Rosca, J., Erdogmus, D., Príncipe, J.P., Haykin, S. (Eds.), Independent Component Analysis and Blind Signal Separation, Proceedings of 6th International Conference, ICA, pp. 32–39 (2006)

  9. Cichocki, A., Lee, H., Kim, Y.D., Choi, S.: Nonnegative Matrix factorization with \(\alpha\)-divergence. Pattern Recognition Letters 29(9), 1433–1440 (2008)

  10. Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.: Nonnegative Matrix and Tensor Factorizations. John Wiley & Sons Ltd, Chichester, UK (2009)

  11. Cichocki, A., Cruces, S., Amari, S.: Generalized Alpha-Beta Divergences and Their Application to Robust Nonnegative Matrix Factorization. Entropy 13, 134–170 (2011). https://doi.org/10.3390/e13010134

  12. Cutler, A., Breiman, L.: Archetypal Analysis. Technometrics 36(4), 338–347 (1994)

  13. Ding, C., Li, T., Jordan, M.: Convex and Semi-Nonnegative Matrix Factorizations. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(1), 45–55 (2010)

  14. Donoho, D., Stodden, V.: “When does non-negative matrix factorization give a correct decomposition into parts?”. In Advances in Neural Information Processing Systems 16 - Proceedings of the 2003 Conference, NIPS 2003 (Advances in Neural Information Processing Systems). Neural information processing systems foundation (2004)

  15. Dua, D., Graff, C.: UCI Machine Learning Repository (2019). [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science

  16. Epifanio, I.: Functional Archetype and Archetypoid Analysis. Computational Statistics and Data Analysis 104, 24–34 (2017)

  17. Eugster, M.J.A., Leisch, F.: From Spider-Man to Hero - Archetypal Analysis in R. Journal of Statistical Software 30(8), 1–23 (2009). https://doi.org/10.18637/jss.v030.i08

  18. Eugster, M.J.A., Leisch, F.: Weighted and Robust Archetypal Analysis. Computational Statistics and Data Analysis 55, 1215–1255 (2011)

  19. Fernández, A., López, V., Galar, M., Jesus, M.J., Herrera, F.: Analysing the Classification of Imbalanced Data-sets with Multiple Classes: Binarization Techniques and Ad-hoc Approaches. Knowledge-Based Systems 42, 97–110 (2013)

  20. Galar, M., Fernández, A., Barrenechea, E., Bustince, H., Herrera, F.: A Review on Ensembles for the Class Imbalance Problem: Bagging-, Boosting-, and Hybrid-Based Approaches. IEEE Transactions on Systems, Man, and Cybernetics, Part C 42(4), 463–484 (2012)

  21. Gawrilow, E., Joswig, M.: “polymake: a Framework for Analyzing Convex Polytopes”. In: Kalai G, Ziegler GM (eds) Polytopes Combinatorics and Computation. Birkhäuser, 43–74 (2000)

  22. Hüllermeier, E., Rifqi, M., Henzgen, S., Senge, R.: Comparing Fuzzy Partitions: A Generalization of the Rand Index and Related Measures. IEEE Transactions on Fuzzy Systems 20(3), 546–556 (2012)

  23. Jain, A.K., Murty, M.N., Flynn, P.J.: Data Clustering: A Review. ACM Computing Surveys 31(3), 1–68 (1999)

  24. Kompass, R.: A Generalized Divergence Measure for Nonnegative Matrix Factorization. Neural Computation 19, 780–791 (2007)

  25. Koren, Y., Bell, R., Volinsky, C.: Matrix Factorization Techniques for Recommender Systems. Computer 42(8), 30–37 (2009)

  26. Lee, D.D., Seung, H.S.: Learning the Parts of Objects by Non-Negative Matrix Factorization. Nature 401, 788–791 (1999)

  27. Matsushita, R., Tanaka, T.: “Low-rank Matrix Reconstruction and Clustering via Approximate Message Passing”, in C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger (Eds.) Advances in Neural Information Processing Systems 26, Curran Associates, Inc., pp. 917–925 (2013)

  28. McNamee, P.: A Comparison of the Grade of Membership Measure with Alternative Health Indicators in Explaining Cost for Older People. Health Economics 13, 379–395 (2004)

  29. Mendes, G.S., Nascimento, S.: “A Study of Fuzzy Clustering to Archetypal Analysis”, In: Yin H., Camacho D., Novais P., Tallón-Ballesteros A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. Lecture Notes in Computer Science, vol 11315, Springer, Cham, pp. 250–261. https://doi.org/10.1007/978-3-030-03496-2_28 (2018)

  30. Mirkin, B.G., Satarov, G.A.: Method of Fuzzy Additive Types for Analysis of Multidimensional Data I. Automation and Remote Control 51(5), 683–688 (1990)

  31. Mørup, M., Hansen, L.K.: Archetypal Analysis for Machine Learning and Data Mining. Neurocomputing 80, 54–63 (2012)

  32. Nascimento, S., Mirkin, B., Moura-Pires, F.: Modeling Proportional Membership in Fuzzy Clustering. IEEE Transactions on Fuzzy Systems 11(2), 173–186 (2003)

  33. Nascimento, S.: Fuzzy Clustering with Proportional Membership Model. IOS Press, Amsterdam (2005)

  34. Nascimento, S., Mirkin, B.: “Ideal Type Model and an Associated Method for Relational Fuzzy Clustering”, Proceedings of the 2017 IEEE International Conference on Fuzzy Systems, https://doi.org/10.1109/FUZZ-IEEE.2017.8015473 (2017)

  35. Paatero, P., Tapper, U.: Positive Matrix Factorization: A Non-Negative Factor Model with Optimal Utilization of Error Estimates of Data Values. Environmetrics 5, 111–126 (1994)

  36. Pedrycz, W., Oliveira, J.V.: A Development of Fuzzy Encoding and Decoding Through Fuzzy Clustering. IEEE Transactions on Instrumentation and Measurement 57(4), 829–837 (2008)

  37. Suleman, A.: A Convex Semi-nonnegative Matrix Factorisation Approach to Fuzzy \(c\)-means Clustering. Fuzzy Sets and Systems 270, 90–110 (2015)

  38. Suleman, A. (a): A Fuzzy Clustering Approach to Evaluate Individual Competencies from REFLEX Data. Journal of Applied Statistics 44(14), 2513–2533 (2017). https://doi.org/10.1080/02664763.2016.1257589

  39. Suleman, A. (b): “Validation of Archetypal Analysis”, Proceedings of the 2017 IEEE International Conference on Fuzzy Systems (2017), https://doi.org/10.1109/FUZZ-IEEE.2017.8015385

  40. Suleman, A. (c): Assessing a Fuzzy Extension of Rand Index and Related Measures. IEEE Transactions on Fuzzy Systems 25(1), 237–244 (2017)

  41. Talbot, L.M., Talbot, B.G., Peterson, R.E., Tolley, H., Mecham, H.D.: Application of Fuzzy Grade-of-Membership Clustering to Analysis of Remote Sensing Data. Journal of Climate 12, 200–219 (1999)

  42. Thurau, C., Kersting, K., Wahabzada, M., Bauckhage, C.: Convex Non-Negative Matrix Factorization for Massive Datasets. Knowledge and Information Systems 29, 457–478 (2011). https://doi.org/10.1007/s10115-010-0352-6

  43. Varki, S., Cooil, B., Rust, R.T.: “Modeling Fuzzy Data in Qualitative Marketing Research”, Journal of Marketing Research XXXVII, 480–489 (2000)

  44. Vinué, G., Epifanio, I., Alemany, S.: Archetypoids: A New Approach to define Representative Archetypal Data. Computational Statistics and Data Analysis 87, 102–115 (2015)

  45. Wang, S., Yao, X.: Multiclass Imbalance Problems: Analysis and Potential Solutions. IEEE Transactions on Systems, Man, and Cybernetics, Part B 42(4), 1119–1130 (2012)

  46. Winkler, R., Klawonn, F., Kruse, R.: Fuzzy \(c\)-means in high dimensional spaces. Int. Jnl. of Fuzzy Syst. Appl. 1, 1–16 (2011)

  47. Woodbury, M.A., Clive, J.: Clinical Pure Types as a Fuzzy Partition. Journal of Cybernetics 11, 277–298 (1974)

  48. Xie, X.L., Beni, G.: A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(8), 841–847 (1991)

  49. Yang, Q., Wu, X.: 10 challenging problems in data mining research. International Journal of Information Technology & Decision Making 5(4), 597–604 (2006)

  50. Zhang, Z.-Y.: “Nonnegative Matrix factorization: Models, Algorithms and Applications”, in D.E. Holmes and L.C. Jain (Eds): Data Mining: Foundations and Intelligent Paradigms 24, pp. 99 – 134 (2012)

Acknowledgements

The author expresses his gratitude to two anonymous reviewers for their many suggestions, careful reading and helpful comments on the earlier version of this manuscript. This work was supported by Fundação para a Ciência e a Tecnologia (FCT), Grant UIDB/00315/2020.

Author information

Corresponding author

Correspondence to Abdul Suleman.

About this article

Cite this article

Suleman, A. Comparing Different Approaches to Archetypal Analysis as a Fuzzy Clustering Tool. Int. J. Fuzzy Syst. 23, 2182–2199 (2021). https://doi.org/10.1007/s40815-021-01088-9

  • DOI: https://doi.org/10.1007/s40815-021-01088-9
