GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data

  • Protocol
Gene Prediction

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1962))

Abstract

GeMoMa is a homology-based gene prediction program that predicts gene models in a target species based on gene models in evolutionarily related reference species. GeMoMa utilizes amino acid sequence conservation, intron position conservation, and RNA-seq data to accurately predict protein-coding transcripts. Furthermore, GeMoMa supports combining predictions based on several reference species, which allows high-quality annotations from different reference species to be transferred to a target species. Here, we present a detailed description of the GeMoMa modules and the GeMoMa pipeline, and show how they can be used on the command line to address particular biological problems.

Corresponding author

Correspondence to Jens Keilwagen.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Keilwagen, J., Hartung, F., Grau, J. (2019). GeMoMa: Homology-Based Gene Prediction Utilizing Intron Position Conservation and RNA-seq Data. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_9

  • DOI: https://doi.org/10.1007/978-1-4939-9173-0_9

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9172-3

  • Online ISBN: 978-1-4939-9173-0

  • eBook Packages: Springer Protocols
