Abstract
Finite mixture models are increasingly used for model-based cluster analysis. To tackle block clustering, which aims to organize the data into homogeneous blocks, we recently proposed a block mixture model. We considered this model under the classification maximum likelihood approach and developed a new algorithm for simultaneous partitioning based on the classification EM algorithm. From the estimation point of view, however, the classification maximum likelihood approach yields inconsistent parameter estimates, so in this paper we consider the block clustering problem under the maximum likelihood approach. Unfortunately, the classical EM algorithm cannot be applied directly to the block mixture model: difficulties arise from the dependence structure in the model, and approximations are required. Considering the block clustering problem under a fuzzy approach, we propose a fuzzy block clustering algorithm to approximate the EM algorithm. To illustrate our approach, we study the case of binary data using a Bernoulli block mixture.
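As a sketch of the model the abstract refers to, a Bernoulli block mixture can be written as below. The notation is an assumption, since none is given in the text above: x = (x_ij) is an n × d binary matrix, z_i ∈ {1, …, g} and w_j ∈ {1, …, m} are row and column labels, π_k and ρ_ℓ are row and column proportions, and α_kℓ ∈ (0, 1) are block parameters. The sum runs over all pairs of partitions (z, w) and does not factorize over rows and columns, which is one way to see why the E-step of the classical EM algorithm is not directly tractable. The second expression is one common form of fuzzy criterion, with memberships s_ik and t_jℓ replacing hard labels; it is given as an illustration of the kind of approximation involved, not as the paper's exact statement.

```latex
% Bernoulli block mixture density (notation assumed, see lead-in)
f(x;\theta) \;=\; \sum_{(z,w)}
  \prod_{i=1}^{n} \pi_{z_i}
  \prod_{j=1}^{d} \rho_{w_j}
  \prod_{i=1}^{n}\prod_{j=1}^{d}
  \alpha_{z_i w_j}^{\,x_{ij}} \bigl(1-\alpha_{z_i w_j}\bigr)^{1-x_{ij}}

% Illustrative fuzzy criterion: expected complete-data log-likelihood
% under factorized memberships, plus the entropies of s and t
G(s,t,\theta) \;=\;
  \sum_{i,k} s_{ik}\log\pi_k
  + \sum_{j,\ell} t_{j\ell}\log\rho_\ell
  + \sum_{i,j,k,\ell} s_{ik}\,t_{j\ell}
    \bigl[x_{ij}\log\alpha_{k\ell} + (1-x_{ij})\log(1-\alpha_{k\ell})\bigr]
  - \sum_{i,k} s_{ik}\log s_{ik}
  - \sum_{j,\ell} t_{j\ell}\log t_{j\ell},
\qquad \sum_{k} s_{ik} = 1,\quad \sum_{\ell} t_{j\ell} = 1 .
```

Under these assumptions, alternately maximizing G over s, over t, and over θ = (π, ρ, α) gives an iterative scheme in which each step is available in closed form.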
Cite this article
Govaert, G., Nadif, M. Fuzzy clustering to estimate the parameters of block mixture models. Soft Comput 10, 415–422 (2006). https://doi.org/10.1007/s00500-005-0502-z
DOI: https://doi.org/10.1007/s00500-005-0502-z