Abstract
Alignment-based gene identification methods utilize sequence conservation between orthologous protein-coding genes to annotate genes in newly sequenced genomes. CESAR is an approach that makes use of existing genome alignments to transfer genes from one genome to other aligned genomes, and thus generates comparative gene annotations. To accurately detect conserved exons that exhibit an intact reading frame and consensus splice sites, CESAR produces a new alignment between orthologous exons, taking information about the exon’s reading frame and splice site positions into account. Furthermore, CESAR is able to detect most evolutionary splice site shifts, which helps to annotate exon boundaries at high precision. Here, we describe how to apply CESAR to generate comparative gene annotations for one or many species, and discuss the strengths and limitations of this approach. CESAR is available at https://github.com/hillerlab/CESAR2.0.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Picardi E, Pesole G (2010) Computational methods for ab initio and comparative gene finding. Methods Mol Biol 609:269–284. https://doi.org/10.1007/978-1-60327-241-4_16
Zhu J, Sanborn JZ, Diekhans M, Lowe CB, Pringle TH, Haussler D (2007) Comparative genomics search for losses of long-established genes on the human lineage. PLoS Comput Biol 3(12):e247. https://doi.org/10.1371/journal.pcbi.0030247
Sharma V, Hiller M (2017) Increased alignment sensitivity improves the usage of genome alignments for comparative gene annotation. Nucleic Acids Res 45(14):8369–8377. https://doi.org/10.1093/nar/gkx554
Sharma V, Elghafari A, Hiller M (2016) Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation. Nucleic Acids Res 44(11):e103. https://doi.org/10.1093/nar/gkw210
Sharma V, Schwede P, Hiller M (2017) CESAR 2.0 substantially improves speed and accuracy of comparative gene annotation. Bioinformatics 33(24):3985–3987. https://doi.org/10.1093/bioinformatics/btx527
Casper J, Zweig AS, Villarreal C, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, Lee CM, Lee BT, Karolchik D, Hinrichs AS, Haeussler M, Guruvadoo L, Navarro Gonzalez J, Gibson D, Fiddes IT, Eisenhart C, Diekhans M, Clawson H, Barber GP, Armstrong J, Haussler D, Kuhn RM, Kent WJ (2018) The UCSC Genome Browser database: 2018 update. Nucleic Acids Res 46(D1):D762–D769. https://doi.org/10.1093/nar/gkx1020
Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, Billis K, Cummins C, Gall A, Giron CG, Gil L, Gordon L, Haggerty L, Haskell E, Hourlier T, Izuogu OG, Janacek SH, Juettemann T, To JK, Laird MR, Lavidas I, Liu Z, Loveland JE, Maurel T, McLaren W, Moore B, Mudge J, Murphy DN, Newman V, Nuhn M, Ogeh D, Ong CK, Parker A, Patricio M, Riat HS, Schuilenburg H, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Zadissa A, Frankish A, Hunt SE, Kostadima M, Langridge N, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Aken BL, Cunningham F, Yates A, Flicek P (2018) Ensembl 2018. Nucleic Acids Res 46(D1):D754–D761. https://doi.org/10.1093/nar/gkx1098
Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D (2003) Evolution’s cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A 100(20):11484–11489. https://doi.org/10.1073/pnas.1932072100
Sharma V, Hecker N, Roscito JG, Foerster L, Langer BE, Hiller M (2018) A genomics approach reveals insights into the importance of gene losses for mammalian adaptations. Nat Commun 9(1):1215. https://doi.org/10.1038/s41467-018-03667-1
Acknowledgment
This work was supported by the Max Planck Society and the German Research Foundation (HI 1423/3-1).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Sharma, V., Hiller, M. (2019). Coding Exon-Structure Aware Realigner (CESAR): Utilizing Genome Alignments for Comparative Gene Annotation. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_10
Download citation
DOI: https://doi.org/10.1007/978-1-4939-9173-0_10
Published:
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9172-3
Online ISBN: 978-1-4939-9173-0
eBook Packages: Springer Protocols