Soft Computing

Volume 10, Issue 4, pp 305–314

Threading with environment-specific score by artificial neural networks

Article

Abstract

Protein threading programs align a probe amino acid sequence onto a library of representative folds of known structure to identify structural homology. A scoring function, usually formulated as a threading energy, evaluates how well the sequence fits the structure. In this paper, a model named threading with environment-specific score (TES) is proposed to build a new threading score function using artificial neural networks. Given a protein structure with a residue-level environment description, the compatibility of each residue in the sequence with its structural environment is assessed. The threading score is constructed from log-odds scores of the probabilities predicted by the trained model, which indicate which residue best fits each environment. Two decoy sets are used to test how well the proposed TES method discriminates native from decoy three-dimensional protein structures. The results show that the performance of the proposed method is comparable to that of knowledge-based potential energy functions.
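The log-odds construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the environment descriptors, the network architecture, or the reference state, so the toy three-letter alphabet, the `BACKGROUND` frequencies, the hand-written probability tables standing in for network outputs, and the helper names `log_odds_score` and `rank_of_native` are all assumptions made for illustration. The sketch also assumes that a higher score means a better fit, which is the opposite sign convention from an energy.

```python
import math

# Hypothetical background (reference) frequencies for a toy 3-letter alphabet.
BACKGROUND = {"A": 0.5, "G": 0.3, "W": 0.2}


def log_odds_score(sequence, env_probs, background=BACKGROUND):
    """Sum log(P(residue | environment) / P(residue)) over all positions.

    env_probs holds one probability table per position; in a real system
    these would come from a trained model, here they are supplied by hand.
    """
    if len(sequence) != len(env_probs):
        raise ValueError("one probability table is needed per residue")
    return sum(
        math.log(probs[aa] / background[aa])
        for aa, probs in zip(sequence, env_probs)
    )


def rank_of_native(native_score, decoy_scores):
    """1-based rank of the native structure when higher scores are better."""
    return 1 + sum(1 for score in decoy_scores if score > native_score)


# Stand-ins for predicted probabilities at each position of two structures.
native_envs = [
    {"A": 0.7, "G": 0.2, "W": 0.1},
    {"A": 0.2, "G": 0.6, "W": 0.2},
    {"A": 0.1, "G": 0.1, "W": 0.8},
]
decoy_envs = [
    {"A": 0.3, "G": 0.4, "W": 0.3},
    {"A": 0.5, "G": 0.3, "W": 0.2},
    {"A": 0.4, "G": 0.4, "W": 0.2},
]

sequence = "AGW"
native_score = log_odds_score(sequence, native_envs)
decoy_score = log_odds_score(sequence, decoy_envs)
print(rank_of_native(native_score, [decoy_score]))
```

In this toy case the same sequence is scored against the environments of a native-like structure and of one decoy; a discriminating score function should rank the native structure first among its decoys.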

Keywords

Protein threading · Decoy set · Knowledge-based potential · Artificial neural network



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  1. School of Computing Science, Middlesex University, London
