Abstract
New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands.
Similar content being viewed by others
References
Harris T D, Buzby P R, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell J, Dimeo J, Efcavitch J W, Giladi E, Gill J, Healy J, Jarosz M, Lapen D, Moulton K, Quake S R, Steinmann K, Thayer E, Tyurina A, Ward R, Weiss H, Xie Z. Single-molecule DNA sequencing of a viral genome. Science, 2008, 320(5872): 106-109.
Fuller C W, Middendorf L R, Benner S A, Church G M, Harris T, Huang X, Jovanovich S B, Nelson J R, Schloss J A, Schwartz D C, Vezenov D V. The challenges of sequencing by synthesis. Nat. Biotechnol., 2009, 27(11): 1013-1023.
Pemov A, Modi H, Chandler D P, Bavykin S. DNA analysis with multiplex microarray-enhanced PCR. Nucleic Acids Res., 2005, 33(2): e11.
Ronaghi M, Uhlen M, Nyren P. A sequencing method based on real-time pyrophosphate. Science, 1998, 281(5375): 363-365.
Ronaghi M, Karamohamed S, Pettersson B, Uhlen M, Nyren P. Real-time DNA sequencing using detection of pyrophosphate release. Anal. Biochem., 1996, 242(1): 84-89.
Bentley D R. Whole-genome re-sequencing. Curr. Opin. Genet. Dev., 2006, 16(6): 545-552.
Eid J, Fehr A, Gray J et al. Real-time DNA sequencing from single polymerase molecules. Science, 2009, 323(5910): 133-138.
Kasianowicz J J, Brandin E, Branton D, Deamer D W. Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl. Acad. Sci. USA, 1996, 93(24): 13770-13773.
Astier Y, Braha O, Bayley H. Toward single molecule DNA sequencing: Direct identification of ribonucleoside and deoxyribonucleoside 5′-monophosphates by using an engineered protein nanopore equipped with a molecular adapter. J. Am. Chem. Soc., 2006, 128(5): 1705-1710.
Branton D, Deamer D W, Marziali A et al. The potential and challenges of nanopore sequencing. Nat. Biotechnol., 2008, 26(10): 1146-1153.
Sigalov G, Comer J, Timp G, Aksimentiev A. Detection of DNA sequences using an alternating electric field in a nanopore capacitor. Nano. Lett., 2008, 8(1): 56-63.
Jett J H, Keller R A, Martin J C, Marrone B L, Moyzis R K, Ratliff R L, Seitzinger N K, Shera E B, Stewart C C. Highspeed DNA sequencing: An approach based upon fluorescence detection of single molecules. J. Biomol. Struct. Dyn., 1989, 7(2): 301-309.
Clarke J, Wu H C, Jayasinghe L, Patel A, Reid S, Bayley H. Continuous base identification for single-molecule nanopore DNA sequencing. Nat. Nanotechnol., 2009, 4(4): 265-270.
Zhang M Q, Smith A D. Challenges in understanding genomewide DNA methylation. J. Comput. Sci. & Technol., 2010, 25(1): 26-34.
Morozova O, Marra M. Applications of next-generation sequencing technologies in functional genomics. Genomics, 2008, 92(5): 255-264.
Chen Y, Souaiaia T, Chen T. PerM: Efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics, 2009, 25 (19): 2514-2521.
Staden R. A strategy of DNA sequencing employing computer programs. Nucleic Acids Res., June 11, 1979, 6(7): 2601-2610.
Gingeras T R, Milazzo J P, Sciaky D, Roberts R J. Computer programs for the assembly of DNA sequences. Nucleic Acids Res., September 25, 1979, 7(9): 529-545.
Gallant J, Maier D, Storer J. On finding minimal length superstrings. J. Computer System Sci., 1980, 20(1): 50-58.
Waterman M S. Introduction to Computational Biology. Chapman & Hall, 1995.
Kececioglu J D, Myers E W. Combinatiorial algorithms for DNA sequence assembly. Algorithmica, 1995, 13(1/2): 7-51.
Kececioglu J D. Exact and approximation algorithms for DNA sequence reconstruction [Ph.D. Dissertation]. University of Arizona, Tucson, USA, 1992.
Myers E W. Toward simplifying and accurately formulating fragment assembly. Journal of Computational Biology, 1995, 2(2): 275-290.
Myers E S. The fragment assembly string graph. Bioinformatics, 2005, 21(Suppl. 2): ii79-ii85.
Venter J C, Adams M D, Myers E W et al. The sequence of the human genome. Science, 2001, 291: 1304-1351.
Lander E S, Linton L M, Birren B et al. Initial sequencing and analysis of the human genome. Nature, 2001, 409(6822): 860-921.
Istrail S, Sutton G, Florea L et al. Whole genome shotgun assembly and comparison of human genome assemblies. Proc. Natl. Acad. Sci. USA, 2003, 101(7): 1916-1921.
Idury R, Waterman M S. A new algorithm for DNA sequence. J. Comput. Biol., 1995, 2(2): 291-306.
Chaisson M J, Pevzner P A. Short read fragment assembly of bacterial genomes. Genome Res., 2008, 18(2): 324-330.
Chaisson M J, Tang H, Pevzner P A. Fragment assembly with short reads. Bioinformatics, 2004, 20(13): 2067-2074.
Myers E W. The fragment assembly string graph. Bioinformatics, 2005, 21(Suppl. 2): ii79-ii85, doi:10.1093/bioinformatics/bti7114.
Pevzner P A, Tang H. Fragment assembly with doublebarreled data. Bioinformatics, 2001, 17(Suppl. 1): S225-S233.
Pevzner P A, Tang H, Waterman M S. A Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. USA, 2001, 98(17): 9748-9753.
Zerbino D R, Birney E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res., 2008, 18(5): 821-829.
Church D M, Goodstadt L, Hillier L W et al. Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol., 2009, 7(5): e1000112.
Valouev A, Schwartz D C, Zhou S, Waterman M S. An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc. Natl. Acad. Sci. USA, 2006, 103(43): 15770-15775.
Nagarajan N, Read T D, Pop M. Scaffolding and validation of bacterial genome assemblies using optical restriction maps. Bioinformatics, 2008, 24(10): 1229-1235.
Schwartz D C, Li X, Hernandez L, Ramnarain S P, Huff E J, Wang Y K. Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science, 1993, 262(5130): 110-114.
Zhou S, Herschleb J, Schwartz D C. A Single Molecule System forWhole Genome Analysis. New Methods for DNA Sequencing, Mitchelson K R (ed)., Amsterdam: Elsevier, 2007.
Dimalanta E T, Lim A, Runnheim R et al. A microfluidic system for large DNA molecule arrays. Anal. Chem., 2004, 76(18): 5293-5301.
Valouev A, Zhang Y, Schwartz D C, Waterman M S. Refinement of optical map assemblies (original paper). Bioinformatics, 2006, 22(10): 1217-1224.
Valouev A, Li L, Liu Y C, Schwartz D C, Yang Y, Zhang Y, Waterman M S. Alignment of optical maps. J. Comput. Biol., 2006, 13(2): 442-462.
Jo K, Dhingra D M, Odijk T et al. A single-molecule barcoding system using nanoslits for DNA analysis. Proc. Natl. Acad. Sci. USA, 2007, 104(8): 2673-2678.
Ramanathan A, Pape L, Schwartz D C. High-density polymerase-mediated incorporation of fluorochrome-labeled nucleotides. Analytical Biochemistry, 2005, 337(1): 1-11.
Ramanathan A, Huff E J, Lamers C C, Potamousis K D. Forrest D K, Schwartz D C. An integrative approach for the optical sequencing of single DNA molecules. Analytical Biochemistry, 2004, 330(2): 227-241.
Zhou S, Pape L, Schwartz D C. Optical Sequencing: Acquisition from Mapped Single Molecule Templates. Next Generation Genome Sequencing: Towards Personalized Medicine, Janitz M (ed.), 2008, Weinheim: Wiley-VCH Verlag & Co., pp.133-149.
Aguilera A, Gomez-Gonzalez B. Genome instability: A mechanistic view of its causes and consequences. Nat. Rev. Genet., 2008, 9(3): 204-217.
Author information
Authors and Affiliations
Corresponding author
Additional information
This work is supported by NIH under Grant No. R01 HG000225 (DCS) and NSF of USA under Grant No. DBI-0501818 (DCS).
Rights and permissions
About this article
Cite this article
Schwartz, D.C., Waterman, M.S. New Generations: Sequencing Machines and Their Computational Challenges. J. Comput. Sci. Technol. 25, 3–9 (2010). https://doi.org/10.1007/s11390-010-9300-x
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-010-9300-x