Training Parse Trees for Efficient VF Coding

  • Conference paper
String Processing and Information Retrieval (SPIRE 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6393)

Abstract

We address the problem of improving variable-length-to-fixed-length codes (VF codes), which have favourable properties for fast compressed pattern matching but only moderate compression ratios. The compression ratio of a VF code depends on the parse tree that is used as its dictionary. We propose a method that trains a parse tree by scanning an input text repeatedly, and we show experimentally that it rapidly improves the compression ratio of VF codes to the level of state-of-the-art compression methods.
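
For readers unfamiliar with VF coding, the following minimal sketch illustrates the general idea of using a parse tree as a dictionary: the text is split greedily into phrases found in the tree, and every phrase is replaced by a codeword of the same fixed length. It is not the training method proposed in the paper, whose details the abstract does not give. The phrase list, the function names (`build_trie`, `vf_encode`, `vf_decode`) and the sample text are all assumptions made for illustration.

```python
def build_trie(phrases):
    """Build a parse tree (trie); a node's "code" is the codeword of its phrase."""
    root = {"children": {}, "code": None}
    for code, phrase in enumerate(phrases):
        node = root
        for ch in phrase:
            node = node["children"].setdefault(ch, {"children": {}, "code": None})
        node["code"] = code
    return root


def vf_encode(text, root):
    """Greedy longest-match parse: each matched phrase yields one codeword."""
    codes = []
    i = 0
    while i < len(text):
        node = root
        best_code, best_len = None, 0
        j = i
        # Walk down the tree as far as the text allows, remembering the
        # deepest node that carries a codeword.
        while j < len(text) and text[j] in node["children"]:
            node = node["children"][text[j]]
            j += 1
            if node["code"] is not None:
                best_code, best_len = node["code"], j - i
        if best_code is None:
            raise ValueError(f"no dictionary phrase matches at position {i}")
        codes.append(best_code)
        i += best_len
    return codes


def vf_decode(codes, phrases):
    """Decoding is a plain table lookup because codewords have fixed length."""
    return "".join(phrases[c] for c in codes)


# Hypothetical dictionary of 8 phrases, so every codeword fits in 3 bits.
phrases = ["a", "b", "ab", "aa", "aab", "ba", "bb", "aaa"]
bits_per_codeword = (len(phrases) - 1).bit_length()

text = "aabababb"
codes = vf_encode(text, build_trie(phrases))
restored = vf_decode(codes, phrases)
```

Because every codeword occupies the same number of bits, codeword boundaries in the compressed data are known in advance, which is the property that makes compressed pattern matching fast. How well the text compresses depends entirely on which phrases the tree contains, and that is the part a training procedure would aim to improve.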

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Uemura, T., Yoshida, S., Kida, T., Asai, T., Okamoto, S. (2010). Training Parse Trees for Efficient VF Coding. In: Chavez, E., Lonardi, S. (eds) String Processing and Information Retrieval. SPIRE 2010. Lecture Notes in Computer Science, vol 6393. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16321-0_17

  • DOI: https://doi.org/10.1007/978-3-642-16321-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16320-3

  • Online ISBN: 978-3-642-16321-0

  • eBook Packages: Computer Science (R0)
