Approximate dictionary queries
We are given a set of n binary strings, each of length m, and consider the problem of answering d-queries: given a binary query string α of length m, a d-query reports whether the set contains a string within Hamming distance d of α.
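To make the definition concrete, the following is a minimal brute-force sketch of a d-query. It is not the data structure described in this paper: it simply scans every stored string and so takes O(nm) time per query. The names `hamming` and `d_query` are illustrative assumptions, and strings are represented as Python `str` values over the characters `0` and `1`.

```python
from typing import Iterable


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def d_query(strings: Iterable[str], alpha: str, d: int) -> bool:
    """Brute-force d-query: is some stored string within distance d of alpha?

    Scans the whole set, so it costs O(n*m) per query. This only
    illustrates the problem statement, not an efficient solution.
    """
    return any(hamming(s, alpha) <= d for s in strings)
```

For example, with the set `["0000", "1111", "1010"]`, the query `"0001"` succeeds as a 1-query because it differs from `"0000"` in one position, while `"0011"` is at distance 2 from every stored string and succeeds only for d ≥ 2.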
We present a data structure of size O(nm) supporting 1-queries in time O(m) and the reporting of all strings within Hamming distance 1 of α in time O(m). The data structure can be constructed in time O(nm). A slightly modified version of the data structure supports the insertion of new strings in amortized time O(m).
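For comparison, a simple hashing baseline for 1-queries can be sketched as follows. This is an assumption-laden illustration, not the structure presented in the paper: it stores each string as an integer in a Python `set` and probes the query together with its m single-bit flips. Each probe hashes an m-bit value, so a query may cost on the order of m² bit operations rather than the O(m) bound stated above. The names `build`, `neighbours_within_1`, and `one_query` are hypothetical.

```python
from typing import Iterable, List, Set


def build(strings: Iterable[str]) -> Set[int]:
    """Store every binary string as an integer (all strings assumed length m)."""
    return {int(s, 2) for s in strings}


def neighbours_within_1(table: Set[int], alpha: str) -> List[str]:
    """Report all stored strings within Hamming distance 1 of alpha.

    Probes alpha itself plus each of its m single-bit flips.
    """
    m = len(alpha)
    q = int(alpha, 2)
    candidates = [q] + [q ^ (1 << i) for i in range(m)]
    # Convert hits back to zero-padded binary strings of length m.
    return [format(c, f"0{m}b") for c in candidates if c in table]


def one_query(table: Set[int], alpha: str) -> bool:
    """1-query: does any stored string lie within distance 1 of alpha?"""
    return bool(neighbours_within_1(table, alpha))
```

Insertion in this baseline is a single `table.add(int(s, 2))`. The gap between its per-query cost and the O(m) bound is the kind of overhead the data structure above is designed to avoid.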
Keywords: Binary String, Bloom Filter, Additional Link, Query String, Feasible Pair