Abstract
The rapid growth in the quantity of available sequence data has made it necessary to develop efficient computational tools for its analysis. Substantial progress has been made in the development of tools for the identification and prediction of genes that are expressed via standard decoding. However, because recoded genes make up only a minority of all genes and their prediction requires different approaches, they are frequently neglected. As a result, they are often mis-annotated in public databases or left undetected during the annotation process. This chapter describes the available computer tools designed for the identification and analysis of recoded genes, as well as the public databases that collect information related to recoding. We also discuss how standard tools for sequence analysis can be used for these purposes.
Acknowledgments
We are grateful to Drs. Sergi Castellano and Kyungsook Han for careful reading of the manuscript and useful comments. This work was supported by funds from Science Foundation Ireland.
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Firth, A.E., Bekaert, M., Baranov, P.V. (2010). Computational Resources for Studying Recoding. In: Atkins, J., Gesteland, R. (eds) Recoding: Expansion of Decoding Rules Enriches Gene Expression. Nucleic Acids and Molecular Biology, vol 24. Springer, New York, NY. https://doi.org/10.1007/978-0-387-89382-2_20
DOI: https://doi.org/10.1007/978-0-387-89382-2_20
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-89381-5
Online ISBN: 978-0-387-89382-2
eBook Packages: Biomedical and Life Sciences (R0)