Multi-Genome Annotation with AUGUSTUS

Gene Prediction

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1962))

Abstract

Comparing multiple related genomes can help to improve their structural annotation. The accuracy and consistency of the predicted exon–intron structures of the protein coding genes can be higher when considering all genomes at once rather than annotating one genome at a time.

The comparative gene prediction algorithm of AUGUSTUS performs such a multi-genome annotation. It uses a multiple alignment of the genomes to exploit evolutionary evidence of conservation and negative selection. Further, AUGUSTUS exploits the fact that orthologous genes typically have congruent exon–intron structures. Comparative AUGUSTUS predicts the genes in all input genomes simultaneously. In this chapter, we walk the reader through a small example with eight vertebrate species, including how to construct an alignment of the input genomes and how to integrate RNA-Seq evidence from multiple species for gene finding.

Acknowledgements

This chapter is based on research that was funded partially by Deutsche Forschungsgemeinschaft grant STA 1009/10-1 to MS and by a scholarship of the Studienstiftung des deutschen Volkes to SN.

Corresponding author

Correspondence to Mario Stanke.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Nachtweide, S., Stanke, M. (2019). Multi-Genome Annotation with AUGUSTUS. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_8

  • DOI: https://doi.org/10.1007/978-1-4939-9173-0_8

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9172-3

  • Online ISBN: 978-1-4939-9173-0

  • eBook Packages: Springer Protocols
