Latent structure models involve real, potentially observable variables and latent, unobservable variables. The framework embraces several particular model types, including factor analysis, latent class analysis, latent trait analysis, latent profile models, mixtures of factor analysers and state-space models. The simplest scenario, involving a single discrete latent variable, covers finite mixture models, hidden Markov chain models and hidden Markov random field models. The paper gives a brief tutorial on maximum likelihood and Bayesian approaches to parameter estimation within these models, emphasising in particular that computational complexity varies greatly among the different scenarios. For the case of a single discrete latent variable, the issue of assessing its cardinality is discussed. Techniques covered include the EM algorithm, Markov chain Monte Carlo methods and variational approximations.
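To make the simplest scenario concrete, the following is an illustrative sketch (not taken from the paper) of the EM algorithm applied to a two-component univariate normal mixture, the most basic finite mixture model with a single discrete latent variable. All function and variable names here are hypothetical, and the common-variance parameterisation is an assumption made to keep the example short.

```python
import math
import random

def em_normal_mixture(data, n_iter=200):
    """EM for a two-component normal mixture with a common variance.

    Illustrative sketch only: returns (weight of component 1,
    mean of component 1, mean of component 2, common variance).
    """
    # Crude initialisation: equal weights, means at the data extremes,
    # variance set to the overall sample variance.
    w, mu1, mu2 = 0.5, min(data), max(data)
    mean = sum(data) / len(data)
    var = max(sum((x - mean) ** 2 for x in data) / len(data), 1e-6)

    def pdf(x, mu, v):
        return math.exp(-(x - mu) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for _ in range(n_iter):
        # E-step: posterior probability (responsibility) that each
        # observation arose from component 1, given current parameters.
        r = []
        for x in data:
            p1 = w * pdf(x, mu1, var)
            p2 = (1 - w) * pdf(x, mu2, var)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate parameters from the responsibilities.
        n1 = sum(r)
        n2 = len(data) - n1
        w = n1 / len(data)
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        var = sum(ri * (x - mu1) ** 2 + (1 - ri) * (x - mu2) ** 2
                  for ri, x in zip(r, data)) / len(data)
    return w, mu1, mu2, var
```

Because the latent component labels are marginalised out analytically in the E-step, no simulation is required here; for the hidden Markov random field case discussed in the paper, the analogous expectations are intractable and MCMC or variational approximations become necessary.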







Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • D. M. Titterington
    1. University of Glasgow, Glasgow, Scotland, UK
