Abstract
Protein and genomic sequence analyses help in understanding the structure, function, and organization of cellular systems. Important tasks in gene analysis include identifying promoter regions, protein-coding regions, and intron-exon boundaries. Protein sequence analysis involves identifying functional motifs and patterns. Sequence search tools help identify similar sequences in protein and genomic databases. Here, we discuss bioinformatics tools that support biological sequence searches and analyses.
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Mathura, V.S. (2009). Biological Sequence Search and Analysis. In: Mathura, V.S., Kangueane, P. (eds) Bioinformatics: A Concept-Based Introduction. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-84870-9_5
DOI: https://doi.org/10.1007/978-0-387-84870-9_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-84869-3
Online ISBN: 978-0-387-84870-9
eBook Packages: Biomedical and Life Sciences (R0)