Abstract
Whole genome sequencing is frequently applied to hundreds of samples within a single microbial population study. The resulting datasets are large and need to be analysed using computationally efficient methods, the development of which is an active research field. Here we review the current state of the art in computational methods used in microbial population genomics. This includes software for assembly and alignment of core genomic regions, which is usually a prerequisite for analysing the ancestry of the genomes via phylogenetic or non-phylogenetic methods. We also review additional techniques aimed at combining genomic data with temporal, geographical or other types of metadata, as well as pan-genome methods of analysis that go beyond the core genome.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Didelot, X. (2017). Computational Methods in Microbial Population Genomics. In: Polz, M., Rajora, O. (eds) Population Genomics: Microorganisms. Population Genomics. Springer, Cham. https://doi.org/10.1007/13836_2017_3
DOI: https://doi.org/10.1007/13836_2017_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04755-9
Online ISBN: 978-3-030-04756-6
eBook Packages: Biomedical and Life Sciences (R0)