Abstract
Scipio and WebScipio are homology-based gene prediction tools designed for annotating multigenic families and for transferring annotations from one species to closely related species. Their strengths include the ability to cope with sequencing-related problems, such as sequencing errors and assemblies with short contigs, and to correctly predict genes with unusually long introns and/or rather short exons. WebScipio is connected to diArk, the largest collection of eukaryotic genome assemblies, and thereby offers a convenient way to correct existing annotations and to extend protein family datasets. WebScipio is also a key resource for researchers interested in mutually exclusive splicing: it allows users to search for alternative exons not only in introns but also in upstream and downstream regions when the search sequence is incomplete. In this chapter, I describe how to use Scipio and WebScipio with a first-time user in mind.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Kollmar, M. (2019). Predicting Genes in Closely Related Species with Scipio and WebScipio. In: Kollmar, M. (ed) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_11
DOI: https://doi.org/10.1007/978-1-4939-9173-0_11
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9172-3
Online ISBN: 978-1-4939-9173-0
eBook Packages: Springer Protocols