Improved scoring function for comparative modeling using the M4T method

  • Dmitry Rykunov
  • Elliot Steinberger
  • Carlos J. Madrid-Aliste
  • András Fiser
Article

Abstract

Comparative protein structure modeling of targets with only remote sequence similarity to their templates can be improved by optimally combining multiple template structures and by improving the quality of the target-template alignment. The recently developed MMM and M4T methods were designed to address these two problems. Here we describe new developments in both the alignment generation and the template selection parts of the modeling algorithms. We set up a new scoring function in MMM to deliver more accurate target-template alignments. This was achieved by developing a novel statistical pairwise potential that combines local and non-local terms and incorporating it into the composite scoring function. The non-local term of the statistical potential uses a shuffled reference state definition, which helped to eliminate most of the false positive signal from the background distribution of pairwise contacts. The accuracy of the scoring function was further increased by using BLOSUM mutation table scores.
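To illustrate the idea of a shuffled reference state, the following is a minimal sketch, not the potential used in MMM. It assumes a simplified contact-based (rather than distance-dependent) pair potential, and all names (`count_contacts`, `shuffled_reference`, `contact_potential`), the pseudocount, and the number of shuffles are hypothetical choices made for illustration. The expected number of contacts for each residue-type pair is estimated by shuffling each sequence over its own fixed contact map, so that composition and chain-size effects appear in the reference and cancel out of the ratio.

```python
import math
import random
from collections import Counter


def pair_key(a, b):
    """Order-independent key for a residue-type pair."""
    return (a, b) if a <= b else (b, a)


def count_contacts(sequence, contacts):
    """Count contacts per residue-type pair for one structure.

    `contacts` is a list of (i, j) index pairs into `sequence`.
    """
    counts = Counter()
    for i, j in contacts:
        counts[pair_key(sequence[i], sequence[j])] += 1
    return counts


def shuffled_reference(sequence, contacts, n_shuffles, rng):
    """Average contact counts after shuffling the sequence over a fixed contact map."""
    totals = Counter()
    residues = list(sequence)
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        totals.update(count_contacts(residues, contacts))
    return {key: value / n_shuffles for key, value in totals.items()}


def contact_potential(structures, n_shuffles=200, pseudocount=1.0, seed=0):
    """Toy potential: -ln(observed / expected) per residue-type pair.

    `structures` is a list of (sequence, contacts) tuples. The pseudocount
    is an illustrative guard against sparse counts, not a value from the paper.
    """
    rng = random.Random(seed)
    observed = Counter()
    expected = Counter()
    for sequence, contacts in structures:
        observed.update(count_contacts(sequence, contacts))
        reference = shuffled_reference(sequence, contacts, n_shuffles, rng)
        for key, value in reference.items():
            expected[key] += value
    keys = set(observed) | set(expected)
    return {
        key: -math.log((observed[key] + pseudocount) / (expected[key] + pseudocount))
        for key in keys
    }


if __name__ == "__main__":
    # Toy structure in which every contact pairs an A with a B.
    toy = [("AAAABBBB", [(0, 4), (1, 5), (2, 6), (3, 7)])]
    for pair, energy in sorted(contact_potential(toy).items()):
        print(pair, round(energy, 3))
```

In this toy case the A-B pair receives a negative (favorable) value because it is observed more often than in the shuffled sequences, while A-A and B-B receive positive values.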

Keywords

  • Homology modeling
  • Comparative modeling
  • Multiple mapping method
  • Target-template alignment
  • Template selection

Abbreviations

  • MMM: Multiple mapping method
  • M4T: Multiple mapping method with multiple templates

Copyright information

© Springer Science+Business Media B.V. 2008

Authors and Affiliations

  • Dmitry Rykunov (1, 2)
  • Elliot Steinberger (1, 2)
  • Carlos J. Madrid-Aliste (1, 2)
  • András Fiser (1, 2)

  1. Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, USA
  2. Department of Biochemistry, Albert Einstein College of Medicine, Bronx, USA
