Protein Structure Modeling with MODELLER

  • Benjamin Webb
  • Andrej Sali
Part of the Methods in Molecular Biology book series (MIMB, volume 1137)


Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence–structure gap. In this chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.


Comparative modeling; Fold assignment; Sequence–structure alignment; Model assessment; Multiple templates



We are grateful to all members of our research group. We also acknowledge support from the National Institutes of Health (U54 GM094625) as well as computing hardware support from Ron Conway, Mike Homer, Hewlett-Packard, NetApp, IBM, and Intel.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Benjamin Webb (1, 2)
  • Andrej Sali (1, 2)
  1. Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA
  2. Department of Pharmaceutical Chemistry, California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, USA
