Pattern Analysis and Applications, Volume 14, Issue 2, pp 193–205

MLiT: mixtures of Gaussians under linear transformations

  • Ahmed Fawzi Otoom (email author)
  • Hatice Gunes
  • Oscar Perez Concha
  • Massimo Piccardi
Short Papers


The curse of dimensionality hinders the effectiveness of density estimation in high-dimensional spaces. Many techniques have been proposed to discover embedded, locally linear manifolds of lower dimensionality, including the mixture of principal component analyzers, the mixture of probabilistic principal component analyzers and the mixture of factor analyzers. In this paper, we propose a novel mixture model for reducing dimensionality based on a linear transformation that is restricted neither to be orthogonal nor to be aligned with the principal directions. For experimental validation, we used the proposed model to classify five “hard” data sets and compared its accuracy with that of other popular classifiers. The proposed method outperformed the mixture of probabilistic principal component analyzers on four of the five data sets, with improvements ranging from 0.5 to 3.2%. Moreover, on all data sets, the proposed method was more accurate than the Gaussian mixture model, with improvements ranging from 0.2 to 3.4%.
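
As an illustration of the general idea only, the sketch below evaluates a Gaussian mixture on linearly transformed data y = Wx, where W is an arbitrary (not necessarily orthogonal) matrix, and classifies by the highest class-conditional log-density. It is not the authors' MLiT training procedure: the function names, the parameter layout and the assumption that W and the mixture parameters are already given are all hypothetical, and no regularized maximum-likelihood fitting is shown.

```python
import numpy as np


def log_gaussian(y, mean, cov):
    """Log density of a multivariate Gaussian at each row of y."""
    d = y.shape[1]
    diff = y - mean
    chol = np.linalg.cholesky(cov)
    # Solve chol @ z = diff.T; squared norms of z are Mahalanobis distances.
    z = np.linalg.solve(chol, diff.T)
    maha = np.sum(z ** 2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def projected_mixture_log_density(x, W, weights, means, covs):
    """Log density of a Gaussian mixture evaluated in the reduced space y = W x.

    W has shape (q, d) with q < d and is assumed given (hypothetical input);
    it need not be orthogonal or aligned with principal directions.
    """
    y = x @ W.T
    comp = np.stack([
        np.log(w) + log_gaussian(y, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    # Numerically stable log-sum-exp over the mixture components.
    top = comp.max(axis=0)
    return top + np.log(np.exp(comp - top).sum(axis=0))


def classify(x, class_models):
    """Assign each row of x to the class whose mixture scores highest."""
    scores = np.stack([
        projected_mixture_log_density(x, **model) for model in class_models
    ])
    return scores.argmax(axis=0)
```

Each entry of `class_models` is assumed to be a dictionary with the keys `W`, `weights`, `means` and `covs`. Equal class priors are assumed, and the density is evaluated in the reduced space only.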


Dimensionality reduction; Regularized maximum-likelihood; Mixture models; Linear transformations; Object classification



The authors wish to thank the Australian Research Council and iOmniscient Pty Ltd, which partially supported this work under the Linkage Project funding scheme, grant LP0668325.



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Ahmed Fawzi Otoom (1), email author
  • Hatice Gunes (1, 2)
  • Oscar Perez Concha (1)
  • Massimo Piccardi (1)
  1. School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney (UTS), Sydney, Australia
  2. Department of Computing, Imperial College, London, UK
