Faster Pattern Matching Algorithm for Arc-Annotated Sequences

  • Takuya Kida
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3847)


We present an improved pattern matching algorithm for arc-annotated sequences. Arc-annotated sequences are used to represent structural information, e.g., for RNA and protein sequences in molecular biology. Given two sequences with arcs, a text of length n and a pattern of length m, the problem is to determine whether the pattern is an arc-preserving subsequence of the text. Although the problem is NP-complete in general, an O(mn) algorithm has been proposed for the case where the given sequences have no crossing arcs. Our contribution is to revise that algorithm and obtain a simpler one. We also present experimental results on its running time.
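
To make the problem statement concrete, the following is a minimal brute-force sketch of the arc-preserving subsequence test. It is not the O(mn) algorithm discussed in the paper and runs in exponential time. It assumes the usual definition from the literature: the pattern must be obtainable from the text by deleting characters, where deleting a character also deletes any arc incident to it, so the arcs among the kept text positions must be exactly the pattern's arcs. The function name, the 0-based positions, and the representation of arcs as pairs of positions are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def is_arc_preserving_subsequence(text, text_arcs, pattern, pattern_arcs):
    """Brute-force check (illustrative only, exponential time).

    Arcs are pairs of 0-based positions; this representation is an
    assumption made for the sketch.
    """
    n, m = len(text), len(pattern)
    # Normalise arcs so that each is stored as (smaller, larger).
    t_arcs = {tuple(sorted(a)) for a in text_arcs}
    p_arcs = {tuple(sorted(a)) for a in pattern_arcs}

    # Try every increasing choice of m text positions to keep.
    for kept in combinations(range(n), m):
        # The kept characters must spell the pattern.
        if any(text[kept[k]] != pattern[k] for k in range(m)):
            continue
        # Arcs that survive the deletion, renumbered to pattern positions.
        induced = {
            (i, j)
            for i in range(m)
            for j in range(i + 1, m)
            if (kept[i], kept[j]) in t_arcs
        }
        if induced == p_arcs:
            return True
    return False
```

For example, with the text "AGCU" carrying a single arc between its first and last characters, the pattern "AU" with an arc between its two characters matches, while "AU" without an arc does not, because keeping both endpoints of the text arc keeps the arc as well.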





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Takuya Kida (1)
  1. Hokkaido University, Sapporo, Japan
