Fast Approximate Point Set Matching for Information Retrieval
- Cite this paper as:
- Clifford R., Sach B. (2007) Fast Approximate Point Set Matching for Information Retrieval. In: van Leeuwen J., Italiano G.F., van der Hoek W., Meinel C., Sack H., Plášil F. (eds) SOFSEM 2007: Theory and Practice of Computer Science. SOFSEM 2007. Lecture Notes in Computer Science, vol 4362. Springer, Berlin, Heidelberg
We investigate randomised algorithms for subset matching with spatial point sets. Given two sets of d-dimensional points, a data set T consisting of n points and a pattern P consisting of m points, the task is to find the largest match for a subset of the pattern in the data set. This problem is known to be 3-SUM hard and so is unlikely to be solvable exactly in subquadratic time. We present an efficient bit-parallel O(nm) time algorithm and an O(n log m) time solution based on correlation calculations using fast Fourier transforms. Both methods are shown experimentally to give answers within a few percent of the exact solution and to provide a considerable practical speedup over existing deterministic algorithms.
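To make the correlation idea concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes one-dimensional points with small non-negative integer coordinates and matching under translation only, and it omits the randomised reduction from d dimensions that the abstract refers to. The names `fft` and `match_counts` are hypothetical and chosen only for this illustration. For every shift s, the sketch counts how many pattern points x satisfy x + s in T by correlating the two indicator arrays with a fast Fourier transform.

```python
import cmath


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def match_counts(data, pattern):
    """Map each shift s to the number of pattern points x with x + s in data.

    Assumption for this sketch: both inputs are collections of
    non-negative integers (1-D points); only translations are considered.
    """
    width_t = max(data) + 1
    width_p = max(pattern) + 1
    size = 1
    while size < width_t + width_p:  # room for the full linear convolution
        size *= 2

    t = [0.0] * size
    for x in set(data):
        t[x] = 1.0
    # Reversing the pattern turns convolution into correlation.
    rp = [0.0] * size
    for x in set(pattern):
        rp[width_p - 1 - x] = 1.0

    product = [u * v for u, v in zip(fft(t), fft(rp))]
    conv = fft(product, inverse=True)
    counts = [round(c.real / size) for c in conv]  # normalise the inverse FFT

    # Index k of the convolution corresponds to shift k - (width_p - 1).
    return {k - (width_p - 1): c for k, c in enumerate(counts) if c > 0}


counts = match_counts([1, 4, 6, 9, 12], [0, 3, 5, 7])
best_shift = max(counts, key=counts.get)
print(best_shift, counts[best_shift])
```

In this toy instance the peak of the correlation identifies the translation under which the largest subset of the pattern lands on data points. The cost of the sketch grows with the coordinate range rather than with n and m, which is one reason a practical method needs an additional reduction step; the details of that step are not given in the abstract and are not reproduced here.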