The size distributions of proteins, mRNA, and nuclear RNA

Journal of Molecular Evolution

Summary

The frequency distributions of size (molecular weight) and of numbers of subunits were determined from lists of over 500 mammalian and bacterial proteins. The size distribution of polypeptides is well fitted by a lognormal distribution with a median value of about 40,000 daltons and a deviation of 1.8. About 60% of all proteins exist in multimeric aggregates. Of the multimers, 75% have either two or four subunits, while less than 1% have an odd number of subunits greater than three. Over 90% of the time, a given multimer is composed of subunits of nearly equal size, so that the size of an N-mer is lognormally distributed with a median value of N × 40,000 daltons and a deviation of 1.8. The distribution of polypeptide size and subunit number is similar for mammalian and bacterial proteins as well as for intracellular and extracellular proteins.
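The lognormal fit described above can be explored numerically. The following is a minimal sketch, not the authors' procedure: it assumes that the quoted "deviation of 1.8" is a geometric (multiplicative) standard deviation, which the summary does not state explicitly, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

MEDIAN_DA = 40_000.0  # median polypeptide size quoted in the summary
GSD = 1.8             # ASSUMPTION: "deviation" read as a geometric standard deviation


def fraction_below(size_da, n_subunits=1):
    """Fraction of N-mers smaller than size_da under the assumed lognormal model."""
    mu = math.log(n_subunits * MEDIAN_DA)  # log of the median N x 40,000 daltons
    sigma = math.log(GSD)                  # standard deviation on the log scale
    return NormalDist(mu, sigma).cdf(math.log(size_da))


def central_range(n_subunits=1, z=1.0):
    """Sizes lying z log-standard-deviations below and above the median."""
    median = n_subunits * MEDIAN_DA
    return median / GSD ** z, median * GSD ** z
```

Under this reading, `central_range()` spans roughly 22,000 to 72,000 daltons for a single polypeptide, and `fraction_below(160_000, n_subunits=4)` is one half because 160,000 daltons is the assumed tetramer median.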

The sedimentation profiles of mRNA from HeLa and CHO cells indicate that the lengths of mammalian mRNA are lognormally distributed with a median value of 1.4 kb and a deviation of 2.0. This implies that, on average, an mRNA species is only about 25% longer than required to code for its mature polypeptide. Therefore, at most a small fraction of mammalian mRNA could code for large precursor polypeptides which are then cleaved into a number of mature polypeptides (like polio mRNA), or for 3′ coterminal mRNAs where the larger species contain the information for up to four proteins (like adenovirus mRNA).
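The comparison between mRNA length and polypeptide size can be checked with rough arithmetic. This sketch is illustrative only: it assumes an average amino acid residue mass of about 110 daltons (a common approximation, not a value given in the summary) and compares the two quoted medians, which may differ from the averages the authors used.

```python
MEAN_RESIDUE_DA = 110.0  # ASSUMPTION: approximate average mass of one amino acid residue
NT_PER_CODON = 3         # three nucleotides encode one residue


def coding_length_kb(polypeptide_da):
    """Approximate coding length in kilobases for a polypeptide of the given mass."""
    residues = polypeptide_da / MEAN_RESIDUE_DA
    return residues * NT_PER_CODON / 1000.0


def excess_fraction(mrna_kb, polypeptide_da):
    """Fraction by which the mRNA exceeds the coding length of its polypeptide."""
    return mrna_kb / coding_length_kb(polypeptide_da) - 1.0
```

With these assumed constants, a 40,000 dalton polypeptide needs about 1.1 kb of coding sequence, so `excess_fraction(1.4, 40_000)` comes out near 0.28, the same order as the "about 25%" stated in the summary.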

The sedimentation profile of nascent nuclear RNA from HeLa suggests that the length distribution of transcription units has two components: an exponential component that decays with a half-length of 10–15 kb, and a high frequency of very short molecules. However, other distributions (for example, the lognormal distribution) of transcription unit lengths could also be consistent with the data if one or more of the following occurred: physiological cleavage of nascent chains, perturbation of non-rRNA transcription by actinomycin D, or degradation during isolation.
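The meaning of a half-length for the exponential component can be made concrete with a short sketch. It covers only a pure exponential distribution, ignores the additional population of very short molecules, and uses hypothetical function names.

```python
import math


def relative_frequency(length_kb, half_length_kb):
    """Frequency at length_kb relative to length zero; halves every half_length_kb."""
    return 0.5 ** (length_kb / half_length_kb)


def mean_length_kb(half_length_kb):
    """Mean of an exponential distribution with the given half-length."""
    return half_length_kb / math.log(2)
```

For example, with a half-length of 15 kb, molecules of 30 kb are one quarter as frequent as the shortest ones, and a half-length of 10 kb corresponds to a mean length of about 14.4 kb for the exponential component alone.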

The length distribution of HeLa nuclear RNA labeled for 60 min is similar to that of nascent nuclear RNA, indicating that a completed hnRNA chain is quickly transported or degraded after being cleaved.

Abbreviations

hnRNA: heterogeneous nuclear RNA

L1/2: in an exponential distribution, the increase in length required to reduce the frequency by a factor of 2

kb: kilobases

kd: kilodaltons

CHO cells: Chinese hamster ovary cells

Additional information

This paper is dedicated to Harold Sommer

About this article

Cite this article

Sommer, S.S., Cohen, J.E. The size distributions of proteins, mRNA, and nuclear RNA. J Mol Evol 15, 37–57 (1980). https://doi.org/10.1007/BF01732582

DOI: https://doi.org/10.1007/BF01732582