A mathematical concept known as a de Bruijn graph turns the formidable challenge of assembling a contiguous genome from billions of short sequencing reads into a tractable computational problem.
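To make the idea concrete, here is a minimal sketch of how short reads might be turned into a de Bruijn graph and spelled back into a sequence. It is illustrative only and is not the authors' implementation. The function names (`de_bruijn_graph`, `eulerian_path`, `assemble`), the toy reads and the choice of k are all assumptions made for this example. It also assumes error-free reads, that each distinct k-mer occurs once in the genome, and that an Eulerian path exists.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a graph whose edges are k-mers and whose nodes are (k-1)-mers.

    Assumption for this sketch: each distinct k-mer is used as one edge,
    so k-mer multiplicity (repeats, coverage) is ignored.
    """
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:
                seen.add(kmer)
                # Edge from the k-mer's prefix to its suffix.
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a node sequence that uses every edge once (Hierholzer's algorithm).

    Assumes such a path exists; no validity check is performed.
    """
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start at a node with one more outgoing than incoming edge, if any;
    # otherwise (an Eulerian cycle) start anywhere.
    start = next(iter(graph))
    for u, targets in graph.items():
        if len(targets) - in_deg[u] == 1:
            start = u
            break
    edges = {u: list(targets) for u, targets in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if edges.get(u):
            stack.append(edges[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along an Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    toy_reads = ["ATGGCGT", "GGCGTGC", "GTGCA"]  # hypothetical reads
    print(assemble(toy_reads, 4))
```

With these toy reads and k = 4 the graph is a single unbranched path, so the script prints `ATGGCGTGCA`. With a smaller k (for example k = 3) the same reads yield a graph with more than one Eulerian path, which is why repeats make real assemblies ambiguous.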
Acknowledgements
This work was supported by grants from Howard Hughes Medical Institute (HHMI grant 52005726), the US National Institutes of Health (NIH grant 3P41RR024851-02S1) and the National Science Foundation (NSF grant DMS-0718810). We are grateful to S. Wasserman for many helpful comments.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Figures 1 and 2
De Bruijn graph from reads with sequencing errors (PDF 139 kb)
About this article
Cite this article
Compeau, P., Pevzner, P. & Tesler, G. How to apply de Bruijn graphs to genome assembly. Nat Biotechnol 29, 987–991 (2011). https://doi.org/10.1038/nbt.2023
DOI: https://doi.org/10.1038/nbt.2023
- Springer Nature America, Inc.