Pattern recognition in several sequences: Consensus and alignment

Bulletin of Mathematical Biology

Abstract

The comparison of several sequences is central to many problems of molecular biology. Examples include finding consensus patterns that define genetic control regions or that determine structural or functional themes. Previously proposed methods, such as dynamic programming, are not adequate for solving problems of realistic size. This paper gives a new and practical solution for finding unknown patterns that occur imperfectly above a preset frequency. Algorithms for finding the patterns are given, as well as estimates of statistical significance.

Additional information

This author supported by a grant from the System Development Foundation.

This author supported by NSF grant MCS-8301960 and by a grant from the System Development Foundation.

This author supported by NIH grant GM19036.

About this article

Cite this article

Waterman, M.S., Arratia, R. & Galas, D.J. Pattern recognition in several sequences: Consensus and alignment. Bltn Mathcal Biology 46, 515–527 (1984). https://doi.org/10.1007/BF02459500

  • DOI: https://doi.org/10.1007/BF02459500
