Benoit, G., Lemaitre, C., Lavenier, D., Drezen, E., Dayris, T., Uricaru, R., Rizk, G.: Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph. BMC Bioinf. 16, 288 (2015)
Bonfield, J.K., Mahoney, M.V.: Compression of FASTQ and SAM format sequencing data. PLoS One 8(3), e59190 (2013)
Burrows, M., Wheeler, D.: A block-sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)
Cánovas, R., Moffat, A., Turpin, A.: Lossy compression of quality scores in genomic data. Bioinformatics 30(15), 2130–2136 (2014)
Chikhi, R., Rizk, G.: Space-efficient and exact de Bruijn graph representation based on a Bloom filter. Algorithms Mol. Biol. 8(1), 22 (2013)
Cox, A.J., Bauer, M.J., Jakobi, T., Rosone, G.: Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform. Bioinformatics 28(11), 1415–1419 (2012)
Deorowicz, S., Grabowski, S.: Compression of DNA sequence reads in FASTQ format. Bioinformatics 27(6), 860–862 (2011)
Deutsch, P., Gailly, J.: Zlib compressed data format specification version 3.3. RFC 1950 (1996)
Grabowski, S., Deorowicz, S., Roguski, Ł.: Disk-based compression of data from genome sequencing. Bioinformatics 31(9), 1389–1395 (2014)
Hach, F., Numanagic, I., Alkan, C., Sahinalp, S.C.: SCALCE: boosting sequence compression algorithms using locally consistent encoding. Bioinformatics 28(23), 3051–3057 (2012)
Huffman, D.: A method for the construction of minimum-redundancy codes. In: Proceedings of the Institute of Radio Engineers (1952)
Janin, L., Rosone, G., Cox, A.J.: Adaptive reference-free compression of sequence quality scores. Bioinformatics 30(1), 24–30 (2014)
Jones, D.C., Ruzzo, W.L., Peng, X., Katze, M.G.: Compression of next-generation sequencing reads aided by highly efficient de novo assembly. Nucleic Acids Res. 40(22), e171 (2012)
Li, H., Durbin, R.: Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25(14), 1754–1760 (2009)
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16), 2078–2079 (2009). doi: 10.1093/bioinformatics/btp352
Mahoney, M.: (2000) http://mattmahoney.net/dc/
Mahoney, M.: Adaptive weighing of context models for lossless data compression. Florida Tech. Technical Report (2005)
Moffat, A.: Implementing the PPM data compression scheme. IEEE Trans. Commun. 38, 1917–1921 (1990)
Rizk, G., Lavenier, D., Chikhi, R.: DSK: k-mer counting with very low memory usage. Bioinformatics 29(5), 652–653 (2013)
Roguski, Ł., Deorowicz, S.: DSRC 2: industry-oriented compression of FASTQ files. Bioinformatics 30(15), 2213–2215 (2014)
Saha, S., Rajasekaran, S.: Efficient algorithms for the compression of FASTQ files. In: 2014 IEEE International Conference on Bioinformatics and Biomedicine (2014)
Sahinalp, S.C., Vishkin, U.: Efficient approximate and dynamic matching of patterns using a labeling paradigm. In: Proceedings of the 37th Annual Symposium on Foundations of Computer Science, FOCS ’96, Washington, DC, pp. 320–328. IEEE Computer Society, Los Alamitos (1996). http://dl.acm.org/citation.cfm?id=874062.875524
Seward, J.: (1996) bzip2: http://www.bzip.org/1.0.3/html/reading.html
Shannon, C., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana (1949)
Tembe, W., Lowey, J., Suh, E.: G-SQZ: compact encoding of genomic sequence and quality data. Bioinformatics 26(17), 2192–2194 (2010)
Wan, R., Anh, V.N., Asai, K.: Transformations for the compression of FASTQ quality scores of next-generation sequencing data. Bioinformatics 28(5), 628–635 (2012)
Welch, T.: A technique for high-performance data compression. Computer 17(6), 8–19 (1984)
Witten, I., Neal, R., Cleary, J.: Arithmetic coding for data compression. Commun. ACM 30, 520–540 (1987)
Yanovsky, V.: ReCoil - an algorithm for compression of extremely large datasets of DNA data. Algorithms Mol. Biol. 6, 23 (2011)
Yu, Y.W., Yorukoglu, D., Berger, B.: Traversing the k-mer landscape of NGS read datasets for quality score sparsification. In: Research in Computational Molecular Biology, pp. 385–399. Springer, Berlin (2014)
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23(3), 337–343 (1977)