A Two-Tire Index Structure for Approximate String Matching with Block Moves

  • Bin Wang
  • Long Xie
  • Guoren Wang
Conference paper

DOI: 10.1007/978-3-642-04205-8_17

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5667)
Cite this paper as:
Wang B., Xie L., Wang G. (2009) A Two-Tire Index Structure for Approximate String Matching with Block Moves. In: Chen L., Liu C., Liu Q., Deng K. (eds) Database Systems for Advanced Applications. DASFAA 2009. Lecture Notes in Computer Science, vol 5667. Springer, Berlin, Heidelberg

Abstract

Many applications need to solve the problem of approximate string matching with block moves. Computing the block edit distance between two strings is NP-complete, so our goal is to filter out as many non-candidate strings as possible. Building on two mature filtering strategies, frequency distance and positional q-grams, we propose a two-tier index structure that uses the two filters more efficiently. We give a full specification of the index structure, including how to choose the character order to achieve better filterability and how to balance the number of strings across clusters. We present experiments on real data sets to evaluate our technique and show that the proposed index structure provides good performance.
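As an illustration of the first of the two filters named in the abstract, the following is a minimal sketch of a frequency distance filter. It is not the authors' index structure. It assumes the commonly used definition of frequency distance (the larger of the total surplus and total deficit in per-character counts) and assumes that the block edit distance charges unit cost for single-character insertions, deletions, and substitutions as well as for block moves. Under those assumptions a block move leaves the character counts unchanged and each character edit changes the frequency distance by at most one, so the frequency distance is a lower bound on the block edit distance. The function names and the threshold parameter `k` are hypothetical.

```python
from collections import Counter


def frequency_distance(s: str, t: str) -> int:
    """Frequency distance between s and t (assumed definition).

    surplus: how many characters s has in excess of t, summed over the alphabet.
    deficit: how many characters t has in excess of s, summed over the alphabet.
    The distance is the larger of the two.
    """
    cs, ct = Counter(s), Counter(t)
    alphabet = cs.keys() | ct.keys()
    surplus = sum(max(cs[c] - ct[c], 0) for c in alphabet)
    deficit = sum(max(ct[c] - cs[c], 0) for c in alphabet)
    return max(surplus, deficit)


def frequency_filter(query: str, candidates: list[str], k: int) -> list[str]:
    """Keep only candidates whose frequency distance to the query is at most k.

    Strings that are dropped cannot be within block edit distance k of the
    query (under the unit-cost assumption stated above). Strings that are
    kept still need verification by a more precise method.
    """
    return [c for c in candidates if frequency_distance(query, c) <= k]


if __name__ == "__main__":
    # "cdab" is "abcd" with one block moved, so the character counts match.
    print(frequency_distance("abcd", "cdab"))
    print(frequency_filter("abcd", ["cdab", "abxy", "abcde"], k=1))
```

The filter is cheap because it only compares character counts, which is why it is suited to pruning before the expensive distance computation. It ignores character order entirely, which is the gap that a positional filter such as positional q-grams is meant to cover.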

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Bin Wang (1, 2)
  • Long Xie (3)
  • Guoren Wang (1, 2)

  1. Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education
  2. School of Information Science and Engineering, Northeastern University, Shenyang, China
  3. Information School, Liaoning University, Shenyang, China
