Efficiently Approximating Markov Tree Bagging for High-Dimensional Density Estimation

  • François Schnitzler
  • Sourour Ammar
  • Philippe Leray
  • Pierre Geurts
  • Louis Wehenkel
Conference paper

DOI: 10.1007/978-3-642-23808-6_8

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6913)
Cite this paper as:
Schnitzler F., Ammar S., Leray P., Geurts P., Wehenkel L. (2011) Efficiently Approximating Markov Tree Bagging for High-Dimensional Density Estimation. In: Gunopulos D., Hofmann T., Malerba D., Vazirgiannis M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6913. Springer, Berlin, Heidelberg

Abstract

We consider algorithms for generating Mixtures of Bagged Markov Trees for density estimation. In problems defined over many variables where few observations are available, these mixtures generally outperform a single Markov tree that maximizes the data likelihood, but they are far more expensive to compute. In this paper, we describe new algorithms for approximating such models, with the aim of speeding up learning without sacrificing accuracy. More specifically, we propose a filtering step, obtained as a by-product of computing a first Markov tree, that avoids considering poor candidate edges in the subsequently generated trees. We compare these algorithms (on synthetic data sets) to Mixtures of Bagged Markov Trees, as well as to a single Markov tree derived by the classical Chow-Liu algorithm and to a recently proposed randomized scheme for building tree mixtures.
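
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the data are discrete, the first tree is a Chow-Liu tree learned on the original sample, and the filter keeps the `n_keep` highest-scoring edges by mutual information plus the edges of the first tree (the paper's actual filtering criterion is not given in the abstract). All function names are hypothetical. Only tree structures are learned here; parameter estimation and the averaging of the resulting densities are omitted.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between discrete columns i and j."""
    n = len(data)
    ci = Counter(row[i] for row in data)
    cj = Counter(row[j] for row in data)
    cij = Counter((row[i], row[j]) for row in data)
    return sum(
        (c / n) * math.log(c * n / (ci[a] * cj[b]))
        for (a, b), c in cij.items()
    )


def max_spanning_tree(n_vars, scored_edges):
    """Kruskal's algorithm on (weight, i, j) triples, heaviest edge first."""
    parent = list(range(n_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _weight, i, j in sorted(scored_edges, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


def chow_liu_structure(data, candidate_edges):
    """Score only the candidate edges, then build a maximum-weight spanning tree."""
    n_vars = len(data[0])
    scored = [(mutual_information(data, i, j), i, j) for i, j in candidate_edges]
    return max_spanning_tree(n_vars, scored), scored


def bagged_trees_with_filter(data, n_trees, n_keep, seed=0):
    """Learn n_trees tree structures; all but the first use a filtered edge set."""
    rng = random.Random(seed)
    n_vars = len(data[0])
    all_edges = list(combinations(range(n_vars), 2))

    # First tree: every pair of variables is scored once (quadratic cost).
    first_tree, scored = chow_liu_structure(data, all_edges)

    # Assumed filter: keep the n_keep best-scoring edges, plus the first tree's
    # edges so that the candidate graph stays connected.
    top = [(i, j) for _weight, i, j in sorted(scored, reverse=True)[:n_keep]]
    candidates = sorted(set(top) | set(first_tree))

    trees = [first_tree]
    for _ in range(n_trees - 1):
        # Bootstrap replicate: sample rows with replacement.
        replicate = [rng.choice(data) for _row in data]
        tree, _scores = chow_liu_structure(replicate, candidates)
        trees.append(tree)
    return trees, candidates
```

In this sketch, each tree after the first scores only `len(candidates)` edges instead of all pairs of variables, which is where the speed-up would come from when `n_keep` is much smaller than the number of pairs.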

Keywords

mixture models, Markov trees, bagging, randomization

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • François Schnitzler (1)
  • Sourour Ammar (2)
  • Philippe Leray (2)
  • Pierre Geurts (1)
  • Louis Wehenkel (1)

  1. Department of EECS and GIGA-Research, Université de Liège, Liège, Belgium
  2. Knowledge and Decision Team, Laboratoire d’Informatique de Nantes Atlantique (LINA) UMR 6241, Ecole Polytechnique de l’Université de Nantes, France
