Identification of Mutations in Laboratory-Evolved Microbes from Next-Generation Sequencing Data Using breseq

Part of the Methods in Molecular Biology book series (MIMB, volume 1151)


Next-generation DNA sequencing (NGS) can be used to reconstruct eco-evolutionary population dynamics and to identify the genetic basis of adaptation in laboratory evolution experiments. Here, we describe how to run the open-source breseq computational pipeline to identify and annotate genetic differences found in whole-genome and whole-population NGS data from haploid microbes for which a high-quality reference genome is available. These methods can also be used to analyze mutants isolated in genetic screens and to detect unintended mutations that may occur during strain construction and genome editing.
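As a rough illustration of what running the pipeline looks like, the sketch below assembles a breseq command line in Python. It is not taken from the chapter. The helper name `build_breseq_command` and all file names are hypothetical. The options `-r` (reference), `-o` (output directory), `-j` (number of processors), and `-p` (polymorphism mode, for whole-population samples) are assumed from the breseq command-line documentation and should be checked against the installed version.

```python
import shlex
from typing import List, Sequence


def build_breseq_command(
    reference: str,
    reads: Sequence[str],
    output_dir: str = "breseq_output",
    polymorphism: bool = False,
    threads: int = 1,
) -> List[str]:
    """Assemble an argument list for a breseq run (hypothetical helper).

    The flags are assumptions based on the breseq documentation:
    -r reference file, -o output directory, -j processors,
    -p polymorphism (mixed-population) mode.
    """
    if not reads:
        raise ValueError("at least one read file is required")
    if threads < 1:
        raise ValueError("threads must be a positive integer")

    cmd = ["breseq", "-r", reference, "-o", output_dir, "-j", str(threads)]
    if polymorphism:
        # Whole-population sample: request polymorphism prediction
        # instead of the default clonal (consensus) mode.
        cmd.append("-p")
    # Read files are positional arguments and go last.
    cmd.extend(reads)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than run it.
    clonal = build_breseq_command(
        "reference.gbk", ["clone_R1.fastq", "clone_R2.fastq"], threads=4
    )
    print(shlex.join(clonal))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where breseq is installed. Building the command as a list rather than a single string avoids shell-quoting problems with file paths.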

Key words

Evolutionary genomics, Genome re-sequencing, Variant caller, Single-nucleotide variant, Structural variant, Insertion sequence, Mobile genetic element, Gene conversion



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Molecular Biosciences, Center for Systems and Synthetic Biology, Center for Computational Biology and Bioinformatics, Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, USA
