Abstract
Owing to rapid technological progress, next-generation sequencers can produce huge numbers of short DNA fragments covering the genomic sequence of an organism in a short time. There is a need for time-efficient algorithms that can assemble these fragments and reconstruct the examined DNA sequence. A previously proposed algorithm for de novo assembly, SR-ASM, produced high-quality results but required long computation times. The proposed hybrid parallel programming strategy uses a two-level hierarchy: computations in threads (on a single node with many cores) and computations on different nodes of a cluster. Tests carried out on real data for Prochlorococcus marinus obtained from a Roche sequencer showed that the algorithm ran 20 times faster than the sequential approach while maintaining high accuracy and outperforming the results of other algorithms.
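The two-level hierarchy described in the abstract can be illustrated with a minimal sketch. This is not the SR-ASM implementation: the choice of pairwise suffix-prefix overlap detection as the parallelized task, the function names (`overlap`, `node_task`, `all_overlaps`) and the parameters (`n_nodes`, `n_threads`, `min_len`) are all assumptions made for illustration. The node level is only simulated by a plain loop over row partitions; on a real cluster each partition would typically be handled by a separate process, for example one MPI rank per node.

```python
from concurrent.futures import ThreadPoolExecutor


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when the longest match is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def node_task(rows, reads, n_threads=2):
    """Level 2 (hypothetical): one node spreads its rows over threads."""
    def row(i):
        # Compare read i against every other read.
        return [(i, j, overlap(reads[i], reads[j]))
                for j in range(len(reads)) if j != i]

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        per_row = list(pool.map(row, rows))
    # Keep only pairs that actually overlap.
    return [edge for r in per_row for edge in r if edge[2] > 0]


def all_overlaps(reads, n_nodes=2, n_threads=2):
    """Level 1 (simulated): split rows round-robin among nodes."""
    parts = [range(rank, len(reads), n_nodes) for rank in range(n_nodes)]
    edges = []
    for rows in parts:  # stand-in for independent cluster nodes
        edges.extend(node_task(rows, reads, n_threads))
    return sorted(edges)


if __name__ == "__main__":
    reads = ["ACGTAC", "TACGGA", "GGATTC"]  # toy data, not real reads
    print(all_overlaps(reads))
```

The result does not depend on how many simulated nodes or threads are used, which is the property that makes this kind of decomposition safe. In CPython, threads running pure-Python code do not give a real speedup because of the global interpreter lock, so the sketch shows only the structure of the decomposition, not the reported performance gain.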
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blazewicz, J. et al. (2012). Highly Efficient Parallel Approach to the Next-Generation DNA Sequencing. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2011. Lecture Notes in Computer Science, vol 7204. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31500-8_27
DOI: https://doi.org/10.1007/978-3-642-31500-8_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31499-5
Online ISBN: 978-3-642-31500-8
eBook Packages: Computer Science (R0)