Abstract
A strategy is presented for searching the gene and protein sequence data banks that combines two previously described algorithms. The implementation of this strategy is thoroughly evaluated with respect to sensitivity, specificity and speed. The establishment of standard benchmarks for comparing programs that search the sequence data banks for homology is proposed.
Cite this article
Lawrence, C.B., Goldman, D.A. & Hood, R.T. Optimized homology searches of the gene and protein sequence data banks. Bltn Mathcal Biology 48, 569–583 (1986). https://doi.org/10.1007/BF02462324
DOI: https://doi.org/10.1007/BF02462324