Succinct Indexes for Circular Patterns

  • Wing-Kai Hon
  • Chen-Hua Lu
  • Rahul Shah
  • Sharma V. Thankachan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7074)

Abstract

Circular patterns are patterns whose circular permutations are also valid patterns. They arise naturally in bioinformatics and computational geometry. In this paper, we consider succinct indexing schemes for a set of d circular patterns of total length n, with each character drawn from an alphabet of size σ. Our method defines the popular Burrows-Wheeler transform (BWT) on circular patterns; based on this, we achieve succinct indexes using n log σ (1 + o(1)) + O(n) + O(d log n) bits of space, while supporting pattern matching and dictionary matching queries efficiently.
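To make the notions of circular permutation and a BWT over circular patterns concrete, the following is a minimal, uncompressed sketch. It is not the succinct index described in the paper. The function names (`rotations`, `circular_bwt`, `occurs_in_circular`) are hypothetical, and sorting rotations by plain string comparison is a simplifying assumption that may differ from the ordering the paper uses.

```python
def rotations(pattern):
    """Return all circular permutations (rotations) of a pattern."""
    return [pattern[i:] + pattern[:i] for i in range(len(pattern))]


def circular_bwt(patterns):
    """Naive BWT-like string over a set of circular patterns.

    Every rotation of every pattern is sorted (assumption: plain
    lexicographic order), and for each rotation we emit the character
    that cyclically precedes its starting position in its pattern.
    """
    entries = []
    for pid, pat in enumerate(patterns):
        for start in range(len(pat)):
            entries.append((pat[start:] + pat[:start], pid, start))
    entries.sort()
    # Index start - 1 wraps to the last character when start == 0.
    return "".join(patterns[pid][start - 1] for _, pid, start in entries)


def occurs_in_circular(query, pattern):
    """Check whether query occurs in some rotation of a circular pattern."""
    if len(query) > len(pattern):
        return False
    # Appending the first len(query) - 1 characters exposes every
    # occurrence that wraps around the end of the pattern.
    return query in pattern + pattern[:max(len(query) - 1, 0)]
```

For example, `occurs_in_circular("ca", "abc")` holds because "ca" appears in the rotation "cab", even though it is not a substring of "abc" itself. This brute-force version stores every rotation explicitly, which is exactly the overhead a succinct index avoids.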

Keywords

Locus Node; Indexing Scheme; Circular Pattern; Circular Permutation; Length Array
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Wing-Kai Hon (1)
  • Chen-Hua Lu (2)
  • Rahul Shah (3)
  • Sharma V. Thankachan (3)
  1. National Tsing Hua University, Taiwan
  2. Academia Sinica, Taiwan
  3. Louisiana State University, USA
