Learning Stochastic Tree Edit Distance

  • Marc Bernard
  • Amaury Habrard
  • Marc Sebban
Conference paper

DOI: 10.1007/11871842_9

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4212)
Cite this paper as:
Bernard M., Habrard A., Sebban M. (2006) Learning Stochastic Tree Edit Distance. In: Fürnkranz J., Scheffer T., Spiliopoulou M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg

Abstract

Trees provide a suitable structural representation for complex tasks such as web information extraction, RNA secondary structure prediction, or the conversion of tree-structured documents. In this context, many applications require computing similarities between pairs of trees. The most studied distance is probably the tree edit distance (ED), whose computational complexity has been improved over the last decade. However, the classic ED usually relies on edit costs fixed a priori, which are often difficult to tune and leave little room for tackling complex problems. In this paper, we focus on learning a stochastic tree ED. We use an adaptation of the Expectation-Maximization algorithm to learn the primitive edit costs. We carried out a series of experiments that confirm the benefit of learning a tree ED rather than imposing edit costs a priori.
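
To illustrate why the choice of edit costs matters, here is a minimal sketch of a tree edit distance with pluggable costs. It assumes a simplified top-down variant in which roots are always matched and only whole subtrees are inserted or deleted; this is an illustrative choice, not necessarily the formulation used in the paper. The names `Node`, `unit_cost`, `tuned_cost`, and `tree_ed` are hypothetical, and the learning step (EM) is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


# cost(a, b): a is None for an insertion, b is None for a deletion.
Cost = Callable[[Optional[str], Optional[str]], float]


def unit_cost(a: Optional[str], b: Optional[str]) -> float:
    """Fixed a priori costs: 0 for a match, 1 for any other operation."""
    return 0.0 if a == b else 1.0


def subtree_cost(t: Node, cost: Cost, delete: bool) -> float:
    """Cost of deleting (or inserting) every node of the subtree t."""
    own = cost(t.label, None) if delete else cost(None, t.label)
    return own + sum(subtree_cost(c, cost, delete) for c in t.children)


def tree_ed(t1: Node, t2: Node, cost: Cost = unit_cost) -> float:
    """Top-down tree edit distance (assumed simplified variant)."""
    a, b = t1.children, t2.children
    # d[i][j]: cost of editing the first i children of t1 into the first j of t2.
    d = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        d[i][0] = d[i - 1][0] + subtree_cost(a[i - 1], cost, True)
    for j in range(1, len(b) + 1):
        d[0][j] = d[0][j - 1] + subtree_cost(b[j - 1], cost, False)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(
                d[i - 1][j] + subtree_cost(a[i - 1], cost, True),   # delete subtree
                d[i][j - 1] + subtree_cost(b[j - 1], cost, False),  # insert subtree
                d[i - 1][j - 1] + tree_ed(a[i - 1], b[j - 1], cost),  # edit recursively
            )
    return cost(t1.label, t2.label) + d[len(a)][len(b)]


def tuned_cost(a: Optional[str], b: Optional[str]) -> float:
    """Hypothetical tuned costs: relabeling 'c' as 'd' is made cheap."""
    if a == "c" and b == "d":
        return 0.2
    return unit_cost(a, b)


t1 = Node("a", [Node("b"), Node("c")])
t2 = Node("a", [Node("b"), Node("d")])
print(tree_ed(t1, t2))              # distance under unit costs
print(tree_ed(t1, t2, tuned_cost))  # distance under the tuned costs
```

The same pair of trees gets a different distance depending on the cost function, which is the tuning problem the abstract refers to. In a stochastic setting, such costs would typically be derived from learned edit-operation probabilities rather than set by hand.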

Keywords

Stochastic tree edit distance, EM algorithm, generative models, discriminative models

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Marc Bernard (1)
  • Amaury Habrard (2)
  • Marc Sebban (1)
  1. EURISE, Université Jean Monnet de Saint-Etienne, Saint-Etienne, France
  2. LIF, Université de Provence, Marseille, France
