Abstract
Standard compression algorithms are not able to compress DNA sequences. Recently, new algorithms have been introduced specifically for this purpose, often relying on the detection of long approximate repeats. In this paper, we present another algorithm, DNAPack, based on dynamic programming. Compared with existing programs, it compresses DNA slightly better, while the cost of dynamic programming is almost negligible.
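The abstract does not spell out how dynamic programming is applied, so the following is only a minimal sketch of the general idea, not DNAPack's actual procedure. It assumes a list of hypothetical candidate segments (each a span of the sequence with an assumed cost in bits, for example a pointer to an earlier repeat) and a fallback cost of 2 bits per literal base. The names `min_cost_parse`, `Candidate` and `LITERAL_BITS_PER_BASE` are illustrative and do not come from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical candidate: (start, end, cost_in_bits) for encoding the
# half-open span [start, end) as a reference to an earlier repeat.
# How candidates and their costs are found is outside this sketch.
Candidate = Tuple[int, int, float]

# Assumed fallback cost: each of A, C, G, T fits in 2 bits.
LITERAL_BITS_PER_BASE = 2.0


def min_cost_parse(n: int, candidates: List[Candidate]) -> Tuple[float, List[Candidate]]:
    """Cheapest cover of positions 0..n-1 by literals and non-overlapping candidates."""
    # Index candidates by the position where they end.
    by_end: List[List[Candidate]] = [[] for _ in range(n + 1)]
    for start, end, cost in candidates:
        if 0 <= start < end <= n:
            by_end[end].append((start, end, cost))

    # best[i] is the minimum number of bits needed to encode the prefix of length i.
    best = [0.0] * (n + 1)
    choice: List[Optional[Candidate]] = [None] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: encode base i-1 as a literal.
        best[i] = best[i - 1] + LITERAL_BITS_PER_BASE
        # Option 2: finish a candidate segment exactly at position i.
        for cand in by_end[i]:
            total = best[cand[0]] + cand[2]
            if total < best[i]:
                best[i] = total
                choice[i] = cand

    # Walk back through the recorded choices to recover the selected segments.
    chosen: List[Candidate] = []
    i = n
    while i > 0:
        cand = choice[i]
        if cand is None:
            i -= 1
        else:
            chosen.append(cand)
            i = cand[0]
    chosen.reverse()
    return best[n], chosen


if __name__ == "__main__":
    # Made-up candidates on a sequence of length 20, for illustration only.
    cost, segments = min_cost_parse(20, [(4, 12, 6.0), (8, 20, 9.0), (12, 20, 7.0)])
    print(cost, segments)
```

The table has one entry per sequence position and each candidate is examined once, so the selection step runs in time linear in the sequence length plus the number of candidates. That is one plausible reading of why such a step could be cheap compared with finding the repeats in the first place.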
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Behzadi, B., Le Fessant, F. (2005). DNA Compression Challenge Revisited: A Dynamic Programming Approach. In: Apostolico, A., Crochemore, M., Park, K. (eds) Combinatorial Pattern Matching. CPM 2005. Lecture Notes in Computer Science, vol 3537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11496656_17
DOI: https://doi.org/10.1007/11496656_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26201-5
Online ISBN: 978-3-540-31562-9
eBook Packages: Computer Science (R0)