Computer Methods to Locate Genes and Signals in Nucleic Acid Sequences
Computer methods are becoming increasingly important both during the determination of a DNA sequence and in its subsequent analysis. This is because sequencing methods are rapid and easy to apply, and hence generate large amounts of data, and also because the rate of sequencing far outstrips the rate at which experiments can be done to elucidate the function of the sequences derived. Elucidating the function of a sequence includes mapping messenger RNAs, promoters, splice junctions and other control regions. While a positive experimental result has the great advantage over computer analysis of giving firm evidence, computer methods are fast and cheap. The purpose of this article is to describe some of the computer techniques developed for locating these sequence features. I include methods to locate protein genes, tRNA genes, promoters, ribosome binding sites, splice junctions, terminator sequences and polyadenylation sites. I shall refer to sequences such as promoters and ribosome binding sites as “signal sequences”. We need to be able to scan through a sequence and to give, for each section of it, some measure of the probability that it contains any of these features.
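As an illustration of what scanning a sequence for a signal can look like, the following is a minimal sketch of a weight matrix search, one of the techniques named in the keywords. It is not taken from the article itself: the function names, the pseudocount, the uniform background frequency of 0.25 and the four toy example sites are all assumptions made for this sketch. A log-odds matrix is built from aligned example sites, and every window of the target sequence is then scored against it, so that higher scores mark sections that more closely resemble the examples.

```python
import math


def build_log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from aligned sites of equal length.

    The pseudocount and the uniform background are assumptions of this sketch.
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases do not give log(0).
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; return a list of (start, score)."""
    width = len(matrix)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Sum the per-position log-odds values for the bases in this window.
        # Only the characters A, C, G and T are handled in this sketch.
        score = sum(matrix[i][base] for i, base in enumerate(window))
        scores.append((start, score))
    return scores


# Hypothetical toy data: four made-up aligned sites and a short target sequence.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
matrix = build_log_odds_matrix(example_sites)
hits = scan("GGCGTATAATGCGC", matrix)
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

In practice a cutoff score would be chosen, for example from the scores of known sites, and both strands of the sequence would be scanned; neither step is shown here.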
Keywords: Amino Acid Composition; Codon Usage; Weight Matrix; Ribosome Binding Site; Complementary Strand