Fast Approximate Point Set Matching for Information Retrieval

  • Raphaël Clifford
  • Benjamin Sach
Conference paper

DOI: 10.1007/978-3-540-69507-3_17

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4362)
Cite this paper as:
Clifford R., Sach B. (2007) Fast Approximate Point Set Matching for Information Retrieval. In: van Leeuwen J., Italiano G.F., van der Hoek W., Meinel C., Sack H., Plášil F. (eds) SOFSEM 2007: Theory and Practice of Computer Science. SOFSEM 2007. Lecture Notes in Computer Science, vol 4362. Springer, Berlin, Heidelberg

Abstract

We investigate randomised algorithms for subset matching with spatial point sets: given two sets of d-dimensional points, a data set T consisting of n points and a pattern P consisting of m points, find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so is unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log m) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and provide a considerable practical speedup over existing deterministic algorithms.
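To make the problem statement concrete, the following is a minimal sketch and not the authors' algorithms. It assumes that a "match" means a translation of the pattern, which the abstract does not state explicitly. The function names `largest_match_voting`, `fft`, and `match_counts_fft` are hypothetical. The first function is an exact quadratic baseline in which every (pattern point, data point) pair votes for one translation. The second illustrates the general idea of counting matches by correlation with a fast Fourier transform, restricted here to one-dimensional non-negative integer coordinates; its cost depends on the coordinate range rather than on n and m, so it does not reproduce the randomised O(n log m) method described above.

```python
import cmath
from collections import Counter


def largest_match_voting(pattern, data):
    """Exact baseline: each (p, t) pair votes for the translation t - p.

    Points are equal-length tuples of coordinates. Returns the translation
    with the most votes and the number of pattern points it matches.
    """
    votes = Counter(
        tuple(tc - pc for pc, tc in zip(p, t))
        for p in set(pattern)
        for t in set(data)
    )
    return max(votes.items(), key=lambda kv: kv[1])


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the result is unnormalised (caller divides by len(a)).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def match_counts_fft(pattern, data):
    """Assumption: 1-D points with non-negative integer coordinates.

    Returns {shift: number of pattern points that land on data points}.
    """
    max_p, max_t = max(pattern), max(data)
    size = 1
    while size < max_p + max_t + 1:  # pad so the convolution does not wrap
        size *= 2
    rev_p = [0.0] * size  # indicator vector of the pattern, reversed
    ind_t = [0.0] * size  # indicator vector of the data set
    for x in set(pattern):
        rev_p[max_p - x] = 1.0
    for x in set(data):
        ind_t[x] = 1.0
    prod = [u * v for u, v in zip(fft(rev_p), fft(ind_t))]
    conv = fft(prod, invert=True)
    # Index k of the convolution corresponds to the shift k - max_p.
    return {
        k - max_p: round(conv[k].real / size)
        for k in range(max_p + max_t + 1)
    }


pattern_1d = [0, 2, 5]
data_1d = [1, 4, 6, 9, 11]
best_shift, best_count = largest_match_voting(
    [(x,) for x in pattern_1d], [(x,) for x in data_1d]
)
counts = match_counts_fft(pattern_1d, data_1d)
```

In this toy instance both routines agree that translating the pattern by 4 matches all three pattern points (0, 2, 5 map to 4, 6, 9). The voting baseline works in any dimension but examines all nm pairs, while the correlation sketch trades that for a cost governed by the size of the padded coordinate range.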


Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Raphaël Clifford (1)
  • Benjamin Sach (1)
  1. University of Bristol, Department of Computer Science, Woodland Road, Bristol, BS8 1UB, UK
