Semi-Supervised Clustering: Application to Image Segmentation

  • Mário A. T. Figueiredo
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


This paper describes a new approach to semi-supervised model-based clustering. The problem is formulated as penalized logistic regression, where the labels are only indirectly observed (via the component densities). This formulation allows deriving a generalized EM algorithm with closed-form update equations, in contrast with related approaches that require expensive Gibbs sampling or suboptimal algorithms. We show how this approach can be naturally used for image segmentation under spatial priors, avoiding the hard combinatorial optimization usually required by classical Markov random fields; this opens the door to the use of sophisticated spatial priors (such as those based on wavelet representations) in a simple and computationally very efficient way.
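As a rough illustration of the idea summarized above, the following sketch runs a generalized EM loop for a two-class, one-dimensional "image". It is not the chapter's algorithm: the known Gaussian component densities, the chain-graph smoothness penalty, and all function names (`chain_precision`, `gem_segment`, and so on) are assumptions made for this example. The E-step computes the posterior label probabilities; the M-step replaces the logistic term by a quadratic lower bound (its curvature never exceeds 1/4), so the hidden logistic variables `z` are updated by solving one fixed linear system.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gaussian_pdf(y, mean, std):
    return np.exp(-0.5 * ((y - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def chain_precision(n, strength, ridge=1e-3):
    """Assumed spatial prior: strength * (chain-graph Laplacian) + ridge * I."""
    lap = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0
    return strength * lap + ridge * np.eye(n)


def penalized_loglik(z, f0, f1, prec):
    """Observed-data log-likelihood minus the quadratic penalty on z."""
    p1 = sigmoid(z)
    return np.sum(np.log(p1 * f1 + (1.0 - p1) * f0)) - 0.5 * z @ prec @ z


def gem_segment(y, means, stds, prec, n_iter=50):
    """Generalized EM with a closed-form (linear solve) update for z."""
    n = y.size
    f0 = gaussian_pdf(y, means[0], stds[0])
    f1 = gaussian_pdf(y, means[1], stds[1])
    system = prec + 0.25 * np.eye(n)  # fixed matrix from the quadratic bound
    z = np.zeros(n)
    history = [penalized_loglik(z, f0, f1, prec)]
    for _ in range(n_iter):
        p1 = sigmoid(z)
        # E-step: posterior probability that each site belongs to class 1.
        w = p1 * f1 / (p1 * f1 + (1.0 - p1) * f0)
        # M-step: maximize the quadratic lower bound of the penalized objective.
        z = np.linalg.solve(system, w - p1 + 0.25 * z)
        history.append(penalized_loglik(z, f0, f1, prec))
    return z, np.array(history)


# Usage on a synthetic two-region signal (all values are illustrative).
rng = np.random.default_rng(0)
signal = np.concatenate([np.zeros(30), np.ones(30)]) + 0.3 * rng.standard_normal(60)
z_hat, objective = gem_segment(signal, (0.0, 1.0), (0.3, 0.3), chain_precision(60, 1.0))
labels = (z_hat > 0).astype(int)  # segment by the sign of the logistic variable
```

Because each M-step only increases a lower bound rather than fully maximizing it, the loop is a generalized EM scheme: the penalized objective in `objective` should never decrease, and no combinatorial search over label configurations is involved.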


Image Segmentation; Neural Information Processing System; Markov Random Field Modelling; Iterate Conditional Mode; Spatial Prior
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Mário A. T. Figueiredo: Instituto de Telecomunicações and Instituto Superior Técnico, Technical University of Lisbon, Lisboa, Portugal
