
The analysis of intron data and their use in the detection of short signals

Journal of Molecular Evolution

Summary

To examine whether certain short DNA sequences (putative splice signals) occurred in a particular region of an intron more often than would be expected by chance, intron data were first examined to determine their statistical structure. There were significant departures from equal nucleotide frequencies, and successive nucleotides clearly did not occur independently in the rat and mouse introns examined. The nonindependence was mainly due to a shortage of CG and a less marked shortage of TA. However, the pairwise frequencies explained almost all the variability in triplet frequencies, so the data could be approximately modeled using nucleotide frequencies conditional on the previous nucleotide. Some coding DNA was also examined; the pairs in the second and third positions, and in the third and first positions, in a codon showed departures from independence similar to those of the intron data. Using the probability model derived for the intron data, expected frequencies of putative signals were derived and compared with the observed frequencies.
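The modeling idea in the summary, nucleotide frequencies conditional on the previous nucleotide, can be illustrated with a minimal Python sketch. This is not the author's procedure (the summary does not give the computational details). It assumes sequences are uppercase strings over A, C, G, T, it uses the overall base frequency for the first base of a signal, and it ignores any restriction to a particular intron region. All function names, the toy sequences, and the example signal are hypothetical.

```python
from collections import Counter

BASES = "ACGT"

def fit_first_order(seqs):
    """Estimate base frequencies and P(next base | previous base) from sequences."""
    single = Counter()
    pairs = Counter()
    for s in seqs:
        single.update(s)                # counts of each base
        pairs.update(zip(s, s[1:]))     # counts of each adjacent pair
    total = sum(single[b] for b in BASES)
    freq = {b: single[b] / total for b in BASES}
    cond = {}
    for a in BASES:
        row_total = sum(pairs[(a, b)] for b in BASES)
        cond[a] = {b: (pairs[(a, b)] / row_total if row_total else 0.0)
                   for b in BASES}
    return freq, cond

def signal_probability(signal, freq, cond):
    """Probability that a given position starts `signal` under the pairwise model."""
    p = freq[signal[0]]
    for prev, nxt in zip(signal, signal[1:]):
        p *= cond[prev][nxt]
    return p

def expected_count(signal, seqs, freq, cond):
    """Expected occurrences: number of possible start positions times probability."""
    positions = sum(max(len(s) - len(signal) + 1, 0) for s in seqs)
    return positions * signal_probability(signal, freq, cond)

def observed_count(signal, seqs):
    """Observed occurrences, counting overlapping matches."""
    k = len(signal)
    return sum(1 for s in seqs
               for i in range(len(s) - k + 1) if s[i:i + k] == signal)

# Toy usage with made-up sequences (not data from the paper).
toy_introns = ["GTAAGTCTGACTTTCCTCAG", "GTGAGTACTAACCTTTTCAG"]
freq, cond = fit_first_order(toy_introns)
signal = "CTGAC"  # arbitrary illustrative pentamer
print(observed_count(signal, toy_introns),
      expected_count(signal, toy_introns, freq, cond))
```

Comparing the two printed numbers mirrors, in a simplified way, the observed-versus-expected comparison described above; a formal test of the difference would need more than this sketch provides.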




Additional information

Some of the work for this paper was done while the author was at the Department of Applied Statistics, University of Reading, England.


Cite this article

Avery, P.J. The analysis of intron data and their use in the detection of short signals. J Mol Evol 26, 335–340 (1987). https://doi.org/10.1007/BF02101152
