A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs

Journal of Structural and Functional Genomics

Abstract

Studying the protein–protein interactions (PPIs) of unique ORFs is one strategy for deciphering their biological roles. For uniform reference, we define unique ORFs as those for which no matching protein is found by a PDB-BLAST search with default parameters. This uniqueness generally precludes the straightforward use of structure-based approaches when designing experiments to explore PPIs. Many open-source bioinformatics tools, from the commonly used to the relatively esoteric, have been built and validated to analyse proteins and/or predict their properties. How can these tools be combined into a protocol that helps the non-expert bioinformaticist design experiments to explore the PPIs of a unique ORF? Here we define a pragmatic protocol to achieve this, and we make it concrete by applying it to two proteins from Mycobacterium tuberculosis, ImuB and ImuA’. The protocol is pragmatic in that decisions are based largely on the availability of easy-to-use freeware. We define the following basic, user-friendly software pathway for building testable PPI hypotheses from a query protein sequence: PSI-PRED → MUSTER → metaPPISP → ASAView and ConSurf. Where possible, other analytical and/or predictive tools may be included. Our protocol combines the software predictions and analyses with general bioinformatics principles to arrive at consensus, prioritised and testable PPI hypotheses.
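The abstract does not spell out how the individual predictions are merged, so the following Python fragment is only a minimal sketch of one way a consensus step could look, not the authors' procedure. It assumes that, for each residue of the query, a user has already collected three values by hand from the web servers: an interface propensity, a relative solvent accessibility, and a conservation grade. The class `ResidueEvidence`, the function `prioritise`, the value ranges, and all thresholds are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResidueEvidence:
    """Per-residue evidence gathered manually from prediction servers (hypothetical layout)."""
    position: int             # 1-based index in the query sequence
    residue: str              # one-letter amino acid code
    interface_score: float    # assumed 0..1 interface propensity
    rel_accessibility: float  # assumed 0..1 relative solvent accessibility
    conservation: int         # assumed 1..9 grade, 9 = most conserved


def prioritise(residues, min_interface=0.5, min_accessibility=0.25,
               min_conservation=7, min_votes=2):
    """Rank residues by how many independent criteria they satisfy.

    All thresholds are illustrative assumptions, not published cut-offs.
    Returns a list of (position, residue, votes) tuples, best candidates first.
    """
    ranked = []
    for r in residues:
        votes = (
            int(r.interface_score >= min_interface)
            + int(r.rel_accessibility >= min_accessibility)
            + int(r.conservation >= min_conservation)
        )
        if votes >= min_votes:
            ranked.append((votes, r.interface_score, r))
    # Most votes first; ties broken by interface score, then by sequence position.
    ranked.sort(key=lambda t: (-t[0], -t[1], t[2].position))
    return [(r.position, r.residue, votes) for votes, _, r in ranked]


# Made-up values for four residues, for illustration only.
example = [
    ResidueEvidence(10, "L", 0.82, 0.40, 8),
    ResidueEvidence(11, "K", 0.61, 0.55, 4),
    ResidueEvidence(12, "G", 0.20, 0.10, 9),
    ResidueEvidence(13, "W", 0.90, 0.05, 9),
]
candidates = prioritise(example)
print(candidates)
```

In this sketch, residues supported by at least two of the three lines of evidence would be the first candidates for mutagenesis or other interaction experiments; a real analysis would also weigh the secondary-structure and threading results, which are left out here.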

Abbreviations

ORF: Open reading frame

TB: Tuberculosis

PPIs: Protein–protein interactions

Acknowledgments

A. Z. holds a Sydney Brenner Fellowship. Y. S. holds a Claude Leon Fellowship. We thank Professor Sir Tom Blundell of the University of Cambridge, UK, for reading the manuscript and making helpful suggestions. We thank Professor Yang Zhang of the University of Michigan, USA, for reading the manuscript. We thank Mr Renxiang Yan of the University of Michigan, USA, for technical assistance with local runs of the MUSTER program. We thank Dr Tichaona Mangwende for helpful discussions and suggestions.

Author information

Corresponding author

Correspondence to Alexander Zawaira.

About this article

Cite this article

Zawaira, A., Shibayama, Y. A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs. J Struct Funct Genomics 13, 185–200 (2012). https://doi.org/10.1007/s10969-012-9141-7

  • DOI: https://doi.org/10.1007/s10969-012-9141-7
