Using the FASTA Program to Search Protein and DNA Sequence Databases

  • William R. Pearson
Part of the Methods in Molecular Biology book series (MIMB, volume 24)


As this volume illustrates, computers have become an integral tool in the analysis of DNA and protein sequence data. One of the most popular applications of computers in modern molecular biology is to characterize newly determined sequences by searching DNA and protein sequence databases. The FASTA program (1,2) is widely used for such searches because it is fast, sensitive, and readily available. FASTA is distributed as part of a package of programs that construct local and global sequence alignments. This chapter describes a number of simple applications of FASTA and other programs in the FASTA package, focusing on the steps required to run the programs rather than on the interpretation of the results of a FASTA search. For a more complete description of FASTA and related programs for identifying distantly related DNA and protein sequences, for evaluating the statistical significance of sequence similarities, and for identifying similar structures in DNA and protein sequences, see ref. 2.
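The idea of characterizing a query sequence by scoring it against every entry in a database can be illustrated with a minimal sketch. The code below is not FASTA and does not use FASTA's heuristics; it assumes a plain Smith-Waterman local alignment score (ref. 7) with a linear gap penalty, and the scoring values, function names, and toy sequences are all hypothetical choices made for illustration.

```python
def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty).

    The match/mismatch/gap values are illustrative assumptions, not the
    scoring matrices or penalties used by the FASTA package.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def search(query, database):
    """Score the query against every entry and rank hits, best score first."""
    hits = [(local_score(query, seq), name) for name, seq in database.items()]
    return sorted(hits, reverse=True)


# Toy "database": names and sequences are made up for illustration.
toy_db = {
    "seq_a": "MKVLAAGIVGLLLAQ",
    "seq_b": "GGGGGGGG",
    "seq_c": "TTMKVLAAGTT",
}
ranked = search("MKVLAAG", toy_db)
for score, name in ranked:
    print(name, score)
```

Full dynamic programming of this kind scales with the product of the query and database lengths; the speed attributed to FASTA in the text comes from avoiding that cost for most database entries, which this sketch does not attempt to reproduce.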


Keywords: Similarity Score; Query Sequence; Protein Sequence Database; FASTA Search; Database File


  1. Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444–2448.
  2. Pearson, W. R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA, in Methods in Enzymology, vol. 183 (Doolittle, R. F., ed.), Academic, New York, pp. 63–98.
  3. Lipman, D. J. and Pearson, W. R. (1985) Rapid and sensitive protein similarity searches. Science 227, 1435–1441.
  4. Pearson, W. R. (1991) Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11, 635–650.
  5. Dayhoff, M., Schwartz, R. M., and Orcutt, B. C. (1978) A model of evolutionary change in proteins, in Atlas of Protein Sequence and Structure, vol. 5, supplement 3 (Dayhoff, M., ed.), National Biomedical Research Foundation, Silver Spring, MD, pp. 345–352.
  6. Doolittle, R. F., Feng, D. F., Johnson, M. S., and McClure, M. A. (1986) Relationships of human protein sequences to those of other organisms. Cold Spring Harb. Symp. Quant. Biol. 51, 447–455.
  7. Smith, T. F. and Waterman, M. S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195–197.
  8. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) A basic local alignment search tool. J. Mol. Biol. 215, 403–410.
  9. Waterman, M. S. and Eggert, M. (1987) A new algorithm for best subsequence alignments with application to tRNA-rRNA comparisons. J. Mol. Biol. 197, 723–728.
  10. Huang, X., Hardison, R. C., and Miller, W. (1990) A space-efficient algorithm for local similarities. CABIOS 6, 373–381.
  11. Huang, X. and Miller, W. (1991) A time-efficient, linear-space local similarity algorithm. Adv. Appl. Math. 12, 337–357.

Copyright information

© Humana Press Inc., Totowa, NJ 1994

Authors and Affiliations

  • William R. Pearson (1)
  1. Department of Biochemistry, University of Virginia, Charlottesville
