Fuzzy clustering to estimate the parameters of block mixture models

Soft Computing

Abstract

Finite mixture models are increasingly used for model-based cluster analysis. To tackle the problem of block clustering, which aims to organize the data into homogeneous blocks, we recently proposed a block mixture model; we considered this model under the classification maximum likelihood approach and developed a new algorithm for simultaneous partitioning based on the classification EM algorithm. From the estimation point of view, however, the classification maximum likelihood approach yields inconsistent estimates of the parameters, so in this paper we consider the block clustering problem under the maximum likelihood approach. Unfortunately, the classical EM algorithm cannot be applied directly to the block mixture model: difficulties arise from the dependence structure in the model, and approximations are required. Considering the block clustering problem under a fuzzy approach, we propose a fuzzy block clustering algorithm to approximate the EM algorithm. To illustrate our approach, we study the case of binary data by using a Bernoulli block mixture.
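
The abstract gives no update equations, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a Bernoulli block mixture with row proportions `pi`, column proportions `rho`, and block parameters `alpha`, and it alternates fuzzy (soft) row and column memberships under a factorized approximation in which each side is updated while the other is held fixed. All names (`fuzzy_block_bernoulli`, `_update`, `_m_step`, `eps`, the random initialization) are assumptions made for this example.

```python
import math
import random


def _softmax(logits):
    """Normalize log-weights into memberships that sum to one."""
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [v / total for v in weights]


def _random_memberships(count, k, rng):
    """Random fuzzy memberships used only to break symmetry at start."""
    return [_softmax([rng.random() for _ in range(k)]) for _ in range(count)]


def _update(data, prior, other, log_a, log_b, eps):
    """Fuzzy memberships for each line of `data`, holding `other` fixed."""
    k_own, k_other = len(prior), len(other[0])
    mass = [sum(r[l] for r in other) for l in range(k_other)]
    out = []
    for line in data:
        # Soft count of ones falling in each cluster of the other side.
        u = [sum(o[l] * v for o, v in zip(other, line)) for l in range(k_other)]
        logits = [
            math.log(max(prior[k], eps))
            + sum(u[l] * log_a[k][l] + (mass[l] - u[l]) * log_b[k][l]
                  for l in range(k_other))
            for k in range(k_own)
        ]
        out.append(_softmax(logits))
    return out


def _m_step(x, c, w, eps):
    """Re-estimate proportions and block parameters from soft memberships."""
    n, d, g, m = len(x), len(x[0]), len(c[0]), len(w[0])
    row_mass = [sum(r[k] for r in c) for k in range(g)]
    col_mass = [sum(r[l] for r in w) for l in range(m)]
    alpha = []
    for k in range(g):
        alpha.append([])
        for l in range(m):
            num = sum(c[i][k] * w[j][l] * x[i][j]
                      for i in range(n) for j in range(d))
            a = num / max(row_mass[k] * col_mass[l], eps)
            alpha[k].append(min(max(a, eps), 1 - eps))  # keep logs finite
    return [v / n for v in row_mass], [v / d for v in col_mass], alpha


def fuzzy_block_bernoulli(x, g, m, n_iter=30, seed=0, eps=1e-6):
    """Alternate soft row and column updates for a binary matrix `x`."""
    rng = random.Random(seed)
    xt = [list(col) for col in zip(*x)]          # columns as lines
    c = _random_memberships(len(x), g, rng)      # row memberships
    w = _random_memberships(len(xt), m, rng)     # column memberships
    pi, rho, alpha = _m_step(x, c, w, eps)
    for _ in range(n_iter):
        la = [[math.log(a) for a in row] for row in alpha]
        lb = [[math.log(1 - a) for a in row] for row in alpha]
        c = _update(x, pi, w, la, lb, eps)
        lat = [list(r) for r in zip(*la)]        # index as [column][row]
        lbt = [list(r) for r in zip(*lb)]
        w = _update(xt, rho, c, lat, lbt, eps)
        pi, rho, alpha = _m_step(x, c, w, eps)
    return c, w, pi, rho, alpha


# Toy binary matrix with two noisy row groups and two column groups.
x = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
c, w, pi, rho, alpha = fuzzy_block_bernoulli(x, g=2, m=2)
```

Each row of `c` and `w` is a vector of membership degrees that sums to one; a hard partition, if wanted, could be read off by taking the largest degree per row or column. The result depends on the random start, so in practice several seeds would likely be tried.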

Author information

Corresponding author

Correspondence to G. Govaert.

About this article

Cite this article

Govaert, G., Nadif, M. Fuzzy clustering to estimate the parameters of block mixture models. Soft Comput 10, 415–422 (2006). https://doi.org/10.1007/s00500-005-0502-z

  • DOI: https://doi.org/10.1007/s00500-005-0502-z
