
Annotation of proteins of unknown function: initial enzyme results

Journal of Structural and Functional Genomics

Abstract

Working with a combination of ProMOL (a plugin for PyMOL that searches a library of enzymatic motifs for local structural homologs), BLAST and Pfam (servers that identify global sequence homologs), and Dali (a server that identifies global structural homologs), we have begun assigning functional annotations to the approximately 3,500 structures in the Protein Data Bank that are currently classified as having “unknown function”. Using a limited template library of 388 motifs, ProMOL has identified over 500 promising in silico matches, 65 of which are exceptionally good. The characteristics of these exceptionally good matches are discussed.
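As an illustration only, the idea of combining evidence from several tools can be sketched in a few lines of Python. This is not the authors' procedure: the `Hit` record, the `consensus_function` helper, the `min_tools` threshold, and the example labels are all hypothetical, and real tool output would first need to be parsed into comparable function labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One candidate annotation (hypothetical record, not a real tool format)."""
    tool: str      # e.g. "ProMOL", "BLAST", "Pfam", or "Dali"
    function: str  # proposed function label, already normalized by the caller


def consensus_function(hits, min_tools=2):
    """Return (label, supporting tools) for the best-supported function.

    A label counts only if at least `min_tools` distinct tools propose it
    (an assumed threshold). Ties are broken alphabetically, which is arbitrary.
    Returns None when no label has enough support.
    """
    support = {}
    for hit in hits:
        support.setdefault(hit.function, set()).add(hit.tool)
    if not support:
        return None
    best = max(support, key=lambda label: (len(support[label]), label))
    if len(support[best]) < min_tools:
        return None
    return best, sorted(support[best])


# Made-up example hits for a single structure of unknown function.
example_hits = [
    Hit("ProMOL", "hydrolase"),
    Hit("Dali", "hydrolase"),
    Hit("BLAST", "isomerase"),
]
print(consensus_function(example_hits))
```

Requiring agreement between a local-structure search and at least one global search is one plausible way to rank candidates; a weighted score per tool would be an equally reasonable alternative.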



Acknowledgments

Support was provided by NIGMS grants 2R15GM078077-02, 3R15GM078077-02S1, and 3R15GM078077-02S2, Dowling College, and Rochester Institute of Technology. We would like to thank the following team members who supported our efforts on this project: (from RIT) Weinishet Tedla-Boyd, Tananda Richards; (from Dowling College) Mogjan Asadi, Limone Rosa.

Conflict of interest

None of the authors have reported any conflicts of interest in the completion of the research described in this manuscript.

Author information

Corresponding author

Correspondence to Paul A. Craig.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 305 kb)

About this article

Cite this article

McKay, T., Hart, K., Horn, A. et al. Annotation of proteins of unknown function: initial enzyme results. J Struct Funct Genomics 16, 43–54 (2015). https://doi.org/10.1007/s10969-015-9194-5

  • DOI: https://doi.org/10.1007/s10969-015-9194-5
