Learning Conditional Linear Gaussian Classifiers with Probabilistic Class Labels

  • Pedro L. López-Cruz
  • Concha Bielza
  • Pedro Larrañaga
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8109)


We study the problem of learning Bayesian classifiers (BC) when the true class label of the training instances is not known, and is substituted by a probability distribution over the class labels for each instance. This scenario can arise, e.g., when a group of experts is asked to individually provide a class label for each instance. We particularize the generalized expectation maximization (GEM) algorithm in [1] to learn BCs with different structural complexities: naive Bayes, averaged one-dependence estimators or general conditional linear Gaussian classifiers. An evaluation conducted on eight datasets shows that BCs learned with GEM perform better than those using either the classical Expectation Maximization algorithm or potentially wrong class labels. BCs achieve similar results to the multivariate Gaussian classifier without having to estimate the full covariance matrices.
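To make the setting concrete, the following minimal sketch shows how a probability distribution over class labels can be built from expert votes and used as per-instance weights when fitting a Gaussian naive Bayes model. It is an illustration under stated assumptions, not the GEM procedure of [1] or the authors' implementation: it performs a single weighted maximum-likelihood fit, and the function names (`soft_labels_from_votes`, `fit_soft_gaussian_nb`, `predict_proba`) and the variance floor are hypothetical choices made for this example.

```python
import math


def soft_labels_from_votes(votes, n_classes):
    """Turn one instance's expert votes (class indices) into a probability vector."""
    counts = [0] * n_classes
    for v in votes:
        counts[v] += 1
    return [c / len(votes) for c in counts]


def fit_soft_gaussian_nb(X, P, var_floor=1e-6):
    """Weighted maximum-likelihood fit of a Gaussian naive Bayes model.

    X: list of feature vectors. P: list of per-instance class probability
    vectors. Instance i contributes to class k with weight P[i][k].
    Assumes every class receives a positive total weight.
    """
    n, d, n_classes = len(X), len(X[0]), len(P[0])
    priors, means, variances = [], [], []
    for k in range(n_classes):
        w = [P[i][k] for i in range(n)]
        w_sum = sum(w)
        priors.append(w_sum / n)
        mu = [sum(w[i] * X[i][j] for i in range(n)) / w_sum for j in range(d)]
        var = [
            max(sum(w[i] * (X[i][j] - mu[j]) ** 2 for i in range(n)) / w_sum,
                var_floor)  # floor avoids zero variance (an assumption of this sketch)
            for j in range(d)
        ]
        means.append(mu)
        variances.append(var)
    return priors, means, variances


def predict_proba(x, model):
    """Posterior class probabilities for x under the fitted naive Bayes model."""
    priors, means, variances = model
    log_post = []
    for pi, mu, var in zip(priors, means, variances):
        lp = math.log(pi)
        for xj, m, v in zip(x, mu, var):
            lp += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        log_post.append(lp)
    top = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lp - top) for lp in log_post]
    z = sum(unnorm)
    return [u / z for u in unnorm]


# Toy data: four instances with one feature, each labelled by three experts.
X = [[0.0], [1.0], [4.0], [5.0]]
votes = [[0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 1, 0]]
P = [soft_labels_from_votes(v, n_classes=2) for v in votes]
model = fit_soft_gaussian_nb(X, P)
```

Calling `predict_proba([0.0], model)` then returns a two-element probability vector for the toy data. Classifiers with richer structure, such as those named in the abstract, would replace the per-feature variances with the parameters of their conditional linear Gaussian densities.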


Keywords: Bayesian classifiers, probabilistic class labels, partially supervised learning, belief functions




References

  1. Côme, E., Oukhellou, L., Denoeux, T., Aknin, P.: Learning from partially supervised data using mixture models and belief functions. Pattern Recognit. 42, 334–348 (2009)
  2. Smets, P., Kennes, R.: The transferable belief model. Artif. Intell. 66, 191–243 (1994)
  3. Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann (1988)
  4. Friedman, N., Goldszmidt, M., Lee, T.J.: Bayesian network classification with continuous attributes: Getting the best of both discretization and parametric fitting. In: Shavlik, J.W. (ed.) Proceedings of the 15th ICML, pp. 179–187. Morgan Kaufmann (1998)
  5. Vannoorenberghe, P., Smets, P.: Partially supervised learning by a credal EM approach. In: Godo, L. (ed.) ECSQARU 2005. LNCS (LNAI), vol. 3571, pp. 956–967. Springer, Heidelberg (2005)
  6. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B-Stat. Methodol. 39, 1–38 (1977)
  7. Lauritzen, S.L., Wermuth, N.: Graphical models for associations between variables, some of which are qualitative and some quantitative. Ann. Stat. 17, 31–57 (1989)
  8. Minsky, M.: Steps toward artificial intelligence. Proc. Inst. Radio Eng. 49, 8–30 (1961)
  9. Webb, G.I., Boughton, J.R., Wang, Z.: Not so naive Bayes: Aggregating one-dependence estimators. Mach. Learn. 58, 5–24 (2005)
  10. Friedman, N.: Learning belief networks in the presence of missing values and hidden variables. In: Fisher, D.H. (ed.) 14th ICML, pp. 125–133. Morgan Kaufmann (1997)
  11. García, S., Herrera, F.: An extension on “Statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J. Mach. Learn. Res. 9, 2677–2694 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Pedro L. López-Cruz (1)
  • Concha Bielza (1)
  • Pedro Larrañaga (1)
  1. Computational Intelligence Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Spain
