String Indexing for Patterns with Wildcards

  • Philip Bille
  • Inge Li Gørtz
  • Hjalte Wedel Vildhøj
  • Søren Vind
Conference paper

DOI: 10.1007/978-3-642-31155-0_25

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7357)
Cite this paper as:
Bille P., Gørtz I.L., Vildhøj H.W., Vind S. (2012) String Indexing for Patterns with Wildcards. In: Fomin F.V., Kaski P. (eds) Algorithm Theory – SWAT 2012. SWAT 2012. Lecture Notes in Computer Science, vol 7357. Springer, Berlin, Heidelberg

Abstract

We consider the problem of indexing a string t of length n to report the occurrences of a query pattern p containing m characters and j wildcards. Let occ be the number of occurrences of p in t, and σ the size of the alphabet. We obtain the following results.

  • A linear space index with query time \(O(m + \sigma^j \log\log n + occ)\). This significantly improves the previously best known linear space index by Lam et al. [ISAAC 2007], which requires query time Θ(jn) in the worst case.

  • An index with query time O(m + j + occ) using space \(O(\sigma^{k^2} n \log^k\log n)\), where k is the maximum number of wildcards allowed in the pattern. This is the first non-trivial bound with this query time.

  • A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].

Our results are obtained using a novel combination of well-known and new techniques, which could be of independent interest.
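
To make the query semantics concrete, here is a minimal brute-force sketch of the problem stated above: it reports every position in t where p occurs, with each wildcard matching any single character. This is only a baseline for illustration, not the index described in the paper. It scans t directly and takes time proportional to n(m + j). The function name and the choice of `*` as the wildcard symbol are assumptions made for this example.

```python
from typing import List

WILDCARD = "*"  # assumed wildcard symbol; the abstract does not fix one


def wildcard_occurrences(t: str, p: str, wildcard: str = WILDCARD) -> List[int]:
    """Return every start position i where p matches t[i:i+len(p)].

    A wildcard in p matches any single character of t; every other
    character of p must match exactly. This is a naive scan, not an index.
    """
    length = len(p)
    return [
        i
        for i in range(len(t) - length + 1)
        if all(pc == wildcard or pc == tc for pc, tc in zip(p, t[i:i + length]))
    ]


print(wildcard_occurrences("banana", "a*a"))  # [1, 3]
```

Here p = `a*a` has m = 2 characters and j = 1 wildcard, and occ = 2. An index in the sense of the abstract would preprocess t once so that such queries avoid rescanning the whole string.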

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Philip Bille (1)
  • Inge Li Gørtz (1)
  • Hjalte Wedel Vildhøj (1)
  • Søren Vind (1)
  1. DTU Informatics, Technical University of Denmark, Denmark
