Multi-Genome Annotation with AUGUSTUS

Nachtweide, Stefanie; Stanke, Mario

doi:10.1007/978-1-4939-9173-0_8

Part of the book series: Methods in Molecular Biology (MIMB, volume 1962)

Abstract

Comparing multiple related genomes can help to improve their structural annotation. The accuracy and consistency of the predicted exon–intron structures of the protein-coding genes can be higher when all genomes are considered at once rather than annotated one at a time.

The comparative gene prediction algorithm of AUGUSTUS performs such a multi-genome annotation. A multiple alignment of genomes is used to exploit evolutionary clues to conservation and negative selection. Further, AUGUSTUS exploits the fact that orthologous genes typically have congruent exon–intron structures. Comparative AUGUSTUS simultaneously predicts the genes in all input genomes. In this chapter, we walk the reader through a small example from eight vertebrate species, including how to construct an alignment of the input genomes and how to integrate RNA-Seq evidence from multiple species for gene finding.
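
The notion of congruent exon–intron structures can be made concrete with a toy sketch. It is not the AUGUSTUS algorithm; it only assumes that exon boundaries of two orthologous genes have already been projected into shared alignment-column coordinates. The function names, the Jaccard-style score, and the coordinates are all hypothetical and chosen for illustration.

```python
from typing import List, Set, Tuple

# (start, end) of an exon in alignment-column coordinates (assumed, end exclusive)
Exon = Tuple[int, int]


def boundaries(exons: List[Exon]) -> Set[int]:
    """Collect every exon start and end as a set of alignment columns."""
    return {pos for exon in exons for pos in exon}


def boundary_congruence(exons_a: List[Exon], exons_b: List[Exon]) -> float:
    """Jaccard similarity of the two boundary sets (1.0 means identical structure)."""
    a, b = boundaries(exons_a), boundaries(exons_b)
    if not a and not b:
        return 1.0  # two empty structures are trivially congruent
    return len(a & b) / len(a | b)


# Hypothetical orthologous genes from two species, already in alignment coordinates.
gene_species1 = [(100, 220), (400, 520), (800, 950)]
gene_species2 = [(100, 220), (400, 523), (800, 950)]  # one shifted splice site

score = boundary_congruence(gene_species1, gene_species2)
print(f"{score:.3f}")
```

A score of 1.0 would mean that both genes share every exon boundary in alignment space; a single shifted splice site, as in the hypothetical second gene, lowers the score. A comparative gene finder can favor predictions whose boundaries agree across species, which is the intuition this sketch is meant to convey.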

Acknowledgements

This chapter is based on research that was funded partially by Deutsche Forschungsgemeinschaft grant STA 1009/10-1 to MS and by a scholarship of the Studienstiftung des deutschen Volkes to SN.

Author information

Authors and Affiliations

Institute of Mathematics and Computer Science, University of Greifswald, Walther-Rathenau-Straße 47, 17487, Greifswald, Germany
Stefanie Nachtweide & Mario Stanke

Corresponding author

Correspondence to Mario Stanke.

Editor information

Editors and Affiliations

Group Systems Biology of Motor Proteins, Department NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany
Martin Kollmar

Cite this protocol

Nachtweide, S., Stanke, M. (2019). Multi-Genome Annotation with AUGUSTUS. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_8

DOI: https://doi.org/10.1007/978-1-4939-9173-0_8
Published: 25 April 2019
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9172-3
Online ISBN: 978-1-4939-9173-0
eBook Packages: Springer Protocols
