Abstract
A program package has been developed to search protein sequence databases for hidden tandem repeats of any specified type. The underlying algorithm, locally optimal cyclic alignment, finds subsequences that possess a given profile-defined type of periodicity even when the periods show no appreciable homology to one another and when arbitrary insertions and deletions are present. The profile can be adjusted to search for structurally and functionally important types of periodicity. The Swiss-Prot database was analyzed to reveal previously undetectable periodicities caused by regularities in the secondary and super-secondary structure of NAD-binding sites. In particular, a significant periodicity of 24 aa was found to be characteristic of the vast majority of domains that possess the Rossmann (or Rossmann-like) fold; these domains display an apparent regularity in their secondary structure that is not obvious at the primary structure level.
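The idea of a locally optimal cyclic alignment can be illustrated with a minimal sketch. It is not the authors' program: the names `consensus_profile` and `cyclic_local_alignment`, the linear gap penalty, the toy match and mismatch scores, and the three-letter period are all assumptions made for illustration. The sketch aligns a sequence against a profile whose columns wrap around, so that the last column is followed by the first. It resets scores at zero, as in a Smith-Waterman local alignment, and allows both skipped residues and skipped profile columns.

```python
from typing import Dict, List, Tuple

# A profile is one score table per column of the period (hypothetical layout).
Profile = List[Dict[str, float]]


def consensus_profile(period: str, match: float = 2.0) -> Profile:
    """Build a toy profile that rewards one residue per column."""
    return [{residue: match} for residue in period]


def cyclic_local_alignment(
    seq: str, profile: Profile, gap: float = 3.0, mismatch: float = -1.0
) -> Tuple[float, int, int]:
    """Return (best score, 1-based end position in seq, end column).

    Assumes gap > 0, so skipping a full turn of the profile never pays off.
    """
    p = len(profile)
    prev = [0.0] * p  # scores for the previous sequence position
    best, best_end, best_col = 0.0, 0, 0
    for i, residue in enumerate(seq, start=1):
        cur = []
        for j in range(p):
            # Residue aligned to column j, arriving from column j-1 (cyclic).
            diag = prev[(j - 1) % p] + profile[j].get(residue, mismatch)
            # Residue treated as an insertion: stay in column j.
            skip_residue = prev[j] - gap
            cur.append(max(0.0, diag, skip_residue))
        # Skipped profile columns (deletions) depend on the same row and
        # wrap around, so the row is swept twice to resolve the cycle.
        for _ in range(2):
            for j in range(p):
                cur[j] = max(cur[j], cur[(j - 1) % p] - gap)
        for j in range(p):
            if cur[j] > best:
                best, best_end, best_col = cur[j], i, j
        prev = cur
    return best, best_end, best_col


if __name__ == "__main__":
    toy = consensus_profile("ABC")
    print(cyclic_local_alignment("XXABCABCABXX", toy))
    print(cyclic_local_alignment("ABCACABC", toy))
```

In the second call the sequence lacks one residue of the period, and the alignment bridges it by skipping a profile column at the cost of one gap penalty. A realistic search would use a position-specific profile over the 20 amino acids, a period such as the 24 aa reported above, and an estimate of statistical significance, none of which is modeled here.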
Cite this article
Laskin, A.A., Korotkov, E.V., Chaley, M.B. et al. The Locally Optimal Method of Cyclic Alignment to Reveal Latent Periodicities in Genetic Texts: the NAD-binding Protein Sites. Molecular Biology 37, 561–570 (2003). https://doi.org/10.1023/A:1025139427862
DOI: https://doi.org/10.1023/A:1025139427862