Fast Approximate Point Set Matching for Information Retrieval

  • Raphaël Clifford
  • Benjamin Sach
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4362)

Abstract

We investigate randomised algorithms for subset matching with spatial point sets: given two sets of d-dimensional points, a data set T consisting of n points and a pattern P consisting of m points, find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log m) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and provide a considerable practical speedup over existing deterministic algorithms.
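
To make the problem statement concrete, the following is a minimal sketch of an exact baseline, not the randomised algorithms of the paper. It assumes that a "match" means a translation of the pattern (the abstract does not spell out the transformation class) and that coordinates are integers, so points can be compared exactly. The function name `largest_subset_match` and the sample points are hypothetical, chosen only for illustration. Every pair of a pattern point and a data point votes for the translation that maps one onto the other; the translation with the most votes aligns the largest subset of P with T. This takes on the order of nm dictionary updates, which is the quadratic cost the approximate methods aim to improve on in practice.

```python
from collections import Counter


def largest_subset_match(data, pattern):
    """Exact baseline (assumed setting: matching under translation).

    Returns (shift, count), where `shift` is the translation vector that
    maps the largest number of pattern points onto data points, and
    `count` is that number. Returns (None, 0) if either set is empty.
    """
    data = set(map(tuple, data))        # deduplicate, make hashable
    pattern = set(map(tuple, pattern))
    if not data or not pattern:
        return None, 0

    votes = Counter()
    for p in pattern:
        for t in data:
            # Translation that would move pattern point p onto data point t.
            shift = tuple(ti - pi for ti, pi in zip(t, p))
            votes[shift] += 1

    # Each (p, t) pair votes once, so the vote total for a shift equals
    # the number of pattern points that shift places on data points.
    shift, count = votes.most_common(1)[0]
    return shift, count


# Hypothetical 2-dimensional example: three of the four pattern points
# coincide with data points after translating by (1, 2).
T = [(0, 0), (1, 2), (3, 1), (4, 4), (7, 8)]
P = [(0, 0), (2, -1), (3, 2), (9, 9)]
shift, count = largest_subset_match(T, P)
print(shift, count)
```

Hashing translation vectors keeps the sketch short; a sort-based variant would give the same answer without relying on hashing.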

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Raphaël Clifford (1)
  • Benjamin Sach (1)
  1. University of Bristol, Department of Computer Science, Woodland Road, Bristol, BS8 1UB, UK
