Genetic Analysis

  • Gavin J. D. Smith
  • Justin Bahl
  • Dhanasekaran Vijaykrishna
Part of the Methods in Molecular Biology book series (MIMB, volume 865)


Genetic analysis of sequence data is central to determining the evolutionary history and molecular epidemiology of viruses, particularly those such as influenza A virus that have complex ecosystems involving multiple hosts. Here we outline routine phylogenetic analyses of influenza A viruses, including multiple sequence alignment, selection of the best-fit evolutionary model, and phylogenetic tree reconstruction using neighbor joining, maximum likelihood, and Bayesian inference.
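As a rough illustration of how an alignment feeds into distance-based tree reconstruction such as neighbor joining, the sketch below computes uncorrected pairwise distances (p-distances) from a toy alignment. It is not taken from the chapter: the function names, the strain labels, the sequences, and the choice to skip sites containing `-`, `?`, or `N` are all assumptions made for illustration. Analyses of real data would normally use model-corrected distances from a dedicated phylogenetics package.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap_chars="-?N"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous character are
    skipped (an illustrative assumption, not the chapter's protocol).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def distance_matrix(alignment):
    """Build a symmetric matrix (dict of dicts) of pairwise p-distances."""
    names = sorted(alignment)
    matrix = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(alignment[x], alignment[y])
        matrix[x][y] = d
        matrix[y][x] = d
    return matrix


# Hypothetical toy alignment; the labels and sequences are made up.
toy_alignment = {
    "strain_A": "ATGAAGGCAA",
    "strain_B": "ATGAAGGCTA",
    "strain_C": "ATGCAG-CTA",
}
toy_matrix = distance_matrix(toy_alignment)

for name in sorted(toy_matrix):
    row = " ".join(f"{toy_matrix[name][other]:.3f}" for other in sorted(toy_matrix))
    print(name, row)
```

A matrix of this kind is the input a neighbor-joining algorithm clusters on. Maximum likelihood and Bayesian methods work from the alignment itself under an explicit substitution model.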

Key words

Sequence alignment, Phylogeny, Evolution, Natural selection, Neighbor joining, Maximum likelihood, Bayesian inference



G.J.D.S. is supported by a career development award under National Institutes of Health, National Institute of Allergy and Infectious Diseases contract HHSN266200700005C. G.J.D.S., J.B., and D.V. are supported by the Duke–NUS Signature Research Program, funded by the Agency for Science, Technology and Research and the Ministry of Health, Singapore.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Gavin J. D. Smith (1)
  • Justin Bahl (1)
  • Dhanasekaran Vijaykrishna (1)
  1. Program in Emerging Infectious Diseases, Duke-NUS Graduate Medical School, Singapore, Singapore
