Predicting Genes in Closely Related Species with Scipio and WebScipio

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1962))

Abstract

Scipio and WebScipio are homology-based gene prediction tools designed for annotating multigene families and for transferring annotations from one species to closely related species. Their strengths include the ability to cope with sequencing-related problems, such as sequencing errors and assemblies with short contigs, and the ability to correctly predict genes with unusually long introns and/or rather short exons. WebScipio is connected to diArk, the largest collection of eukaryotic genome assemblies, and thereby offers a very convenient way to correct existing annotations and to extend protein family datasets. WebScipio is also a key resource for researchers interested in mutually exclusive splicing: it allows users to search for alternative exons not only in introns but also in upstream and downstream regions when the search sequence is incomplete. In this chapter, I describe how to use Scipio and WebScipio, keeping a first-time user in mind.

Author information

Corresponding author

Correspondence to Martin Kollmar.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Kollmar, M. (2019). Predicting Genes in Closely Related Species with Scipio and WebScipio. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_11

  • DOI: https://doi.org/10.1007/978-1-4939-9173-0_11

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9172-3

  • Online ISBN: 978-1-4939-9173-0

  • eBook Packages: Springer Protocols
