Abstract
In recent years, molecular biology has seen numerous technological advances, including next- and third-generation sequencing of DNA genomes and mRNA transcripts and mass spectrometry of proteins. Of these, genome sequencing arguably affects the virologist the most. As of 2017, more than 480 complete poxvirus genome sequences have been generated, and almost all molecular virologists use them constantly and in many different ways. This growth in data acquisition has been matched by an explosion in the relatively new field of bioinformatics, which provides databases to store and organize these valuable and expensive data and algorithms to analyze them. For the bench virologist, access to intuitive, easy-to-use software is often critical for performing bioinformatics-based experiments. Three common hurdles for the researcher are (1) selecting, retrieving, and reformatting genomics data from large databases; (2) using tools to compare and analyze the genomics data; and (3) displaying and interpreting complex sets of results. This chapter is directed at the bench virologist and describes software that helps overcome these obstacles, with a focus on the comparison and analysis of poxvirus genomes. Although poxvirus genomes are stored in public databases such as GenBank, this resource can be cumbersome and tedious to use if large amounts of data must be collected. Therefore, we also highlight our Viral Orthologous Clusters database system and the integrated tools that we developed specifically for the management and analysis of complete viral genomes.
Acknowledgments
The authors wish to thank the many programmers, researchers, and students who have contributed to the Virus Bioinformatics Resource software. This work has been supported by funds from the Natural Sciences and Engineering Research Council of Canada. Drs. C. Upton, R. M. L. Buller, and E. J. Lefkowitz were the original developers of the Poxvirus Bioinformatics Resource Center.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Tu, SL., Upton, C. (2019). Bioinformatics for Analysis of Poxvirus Genomes. In: Mercer, J. (eds) Vaccinia Virus. Methods in Molecular Biology, vol 2023. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9593-6_2
DOI: https://doi.org/10.1007/978-1-4939-9593-6_2
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9592-9
Online ISBN: 978-1-4939-9593-6
eBook Packages: Springer Protocols