Measuring Tree Similarity for Natural Language Processing Based Information Retrieval

  • Zhiwei Lin
  • Hui Wang
  • Sally McClean
Conference paper

DOI: 10.1007/978-3-642-13881-2_2

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6177)
Cite this paper as:
Lin Z., Wang H., McClean S. (2010) Measuring Tree Similarity for Natural Language Processing Based Information Retrieval. In: Hopfe C.J., Rezgui Y., Métais E., Preece A., Li H. (eds) Natural Language Processing and Information Systems. NLDB 2010. Lecture Notes in Computer Science, vol 6177. Springer, Berlin, Heidelberg

Abstract

Natural language processing based information retrieval (NIR) aims to go beyond conventional bag-of-words based information retrieval (KIR) by considering syntactic and even semantic information in documents. NIR is a conceptually appealing approach to IR, but it is hard in practice because it requires measuring the distance or similarity between structures. We aim to move beyond the state of the art in measuring structure similarity for NIR.

In this paper, we propose dtwAcs, a novel tree similarity measure based on a novel interpretation of trees as multidimensional sequences. The distance between two trees is computed as the distance between their multidimensional sequences, which is obtained by integrating the all common subsequences (ACS) measure into the dynamic time warping (DTW) method. Experimental results show that dtwAcs outperforms the state of the art.
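The following Python sketch illustrates the general idea of combining a common-subsequence similarity with dynamic time warping. It is not the authors' dtwAcs: the abstract does not specify how a tree is turned into a multidimensional sequence, so the sketch assumes, purely for illustration, that each tree is given as an ordered list of root-to-leaf label paths. The function names (`count_common_subsequences`, `acs_similarity`, `dtw_distance`, `tree_distance`), the cosine-style normalisation, and the local cost of one minus the similarity are all assumptions of this sketch.

```python
from math import inf, sqrt


def count_common_subsequences(a, b):
    """Count pairs of index selections, one from a and one from b, that spell
    the same subsequence. The empty subsequence is included, so the result
    is always at least 1."""
    m, n = len(a), len(b)
    c = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Inclusion-exclusion over dropping the last item of a or of b.
            c[i][j] = c[i - 1][j] + c[i][j - 1] - c[i - 1][j - 1]
            if a[i - 1] == b[j - 1]:
                # Matching last items extend every shorter common pair.
                c[i][j] += c[i - 1][j - 1]
    return c[m][n]


def acs_similarity(a, b):
    """Cosine-style normalisation of the count into [0, 1] (assumed here)."""
    num = count_common_subsequences(a, b)
    den = sqrt(count_common_subsequences(a, a) * count_common_subsequences(b, b))
    return num / den


def dtw_distance(xs, ys, cost):
    """Classic dynamic time warping between two sequences of items."""
    m, n = len(xs), len(ys)
    d = [[inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            step = cost(xs[i - 1], ys[j - 1])
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[m][n]


def tree_distance(paths_a, paths_b):
    """DTW over label paths, with local cost 1 - acs_similarity (assumed)."""
    return dtw_distance(paths_a, paths_b, lambda p, q: 1.0 - acs_similarity(p, q))


# Hypothetical parse-tree fragments, each encoded as root-to-leaf label paths.
t1 = [["S", "NP", "DT"], ["S", "NP", "NN"], ["S", "VP", "VBD"]]
t2 = [["S", "NP", "NN"], ["S", "VP", "VBD"]]
print(tree_distance(t1, t1))
print(tree_distance(t1, t2))
```

Under this encoding, identical trees have distance zero, and the distance grows as aligned paths share fewer common subsequences. A different tree-to-sequence encoding or a different local cost would give different values.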

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Zhiwei Lin (1)
  • Hui Wang (1)
  • Sally McClean (1)
  1. Faculty of Computing and Engineering, University of Ulster, Northern Ireland, United Kingdom
