Window Subsequence Problems for Compressed Texts

  • Patrick Cégielski
  • Irène Guessarian
  • Yury Lifshits
  • Yuri Matiyasevich
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3967)


Given two strings (a text t of length n and a pattern p) and a natural number w, window subsequence problems consist of deciding whether p occurs as a subsequence of t and/or counting the windows of t of size (at most) w that contain p as a subsequence. Here p occurs as a subsequence of a window if the letters of p appear in the window in the same order as in p, but not necessarily consecutively (they may be interleaved with other letters). We search for subsequences in a text compressed by a Lempel-Ziv-like algorithm, without decompressing the text, and we want our algorithms to be almost optimal, in the sense that they run in time O(m), where m is the size of the compressed text. The pattern is left uncompressed, because these compression algorithms are adaptive ("evolutive"): different occurrences of the same pattern may be encoded differently in the compressed text.
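To make the problem statement concrete, here is a minimal Python sketch. It is not the algorithm of the paper. The first two functions are a naive baseline on uncompressed text, assuming windows of exactly size w. The third decides subsequence containment on a text given as a straight-line program (a grammar in which every rule is a single letter or a pair of earlier nonterminals), used here as a simplified stand-in for Lempel-Ziv-like compression. All names (`is_subsequence`, `count_windows`, `slp_contains_subsequence`, `rules`, `order`) are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple, Union

# A rule is either one terminal letter or a pair of nonterminal names.
Rule = Union[str, Tuple[str, str]]


def is_subsequence(p: str, s: str) -> bool:
    """True if the letters of p occur in s in order, not necessarily adjacent."""
    it = iter(s)
    # `ch in it` consumes the iterator up to the first match, so order is kept.
    return all(ch in it for ch in p)


def count_windows(t: str, p: str, w: int) -> int:
    """Naive baseline: count windows t[i:i+w] that contain p as a subsequence."""
    if w > len(t):
        return 0
    return sum(is_subsequence(p, t[i:i + w]) for i in range(len(t) - w + 1))


def slp_contains_subsequence(rules: Dict[str, Rule], order: List[str],
                             start: str, p: str) -> bool:
    """Decide whether p is a subsequence of the text generated by `start`,
    without expanding the grammar.

    `order` must list the nonterminals so that the children of each rule
    appear before the rule itself (an assumption of this sketch).
    adv[X][i] is the number of pattern letters matched after greedily
    reading the expansion of X when i letters were matched before it.
    """
    k = len(p)
    adv: Dict[str, List[int]] = {}
    for name in order:
        rule = rules[name]
        if isinstance(rule, str):
            # Terminal: advance by one exactly when it is the next letter of p.
            adv[name] = [i + 1 if i < k and p[i] == rule else i
                         for i in range(k + 1)]
        else:
            left, right = rule
            # Concatenation: feed the result of the left part into the right.
            adv[name] = [adv[right][adv[left][i]] for i in range(k + 1)]
    return adv[start][0] == k
```

For a grammar with m rules, the table-based function performs on the order of m(|p| + 1) elementary steps, so in this sketch the cost depends on the size of the compressed representation rather than on the length n of the expanded text.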


Keywords: Pattern Match; Message Sequence Chart; Text Size; Compress Text; Text Window
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Patrick Cégielski (1)
  • Irène Guessarian (2)
  • Yury Lifshits (3)
  • Yuri Matiyasevich (3)
  1. LACL, UMR-FRE 2673, Université Paris 12, Fontainebleau, France
  2. LIAFA, UMR 7089 and Université Paris 6, Paris, France
  3. Steklov Institute of Mathematics, St. Petersburg, Russia
