Journal of Structural and Functional Genomics, Volume 13, Issue 4, pp 185–200

A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs



Studying the protein–protein interactions (PPIs) of unique ORFs is one strategy for deciphering their biological roles. For uniform reference, we define unique ORFs as those for which no matching protein is found after a PDB-BLAST search with default parameters. The uniqueness of these ORFs generally precludes the straightforward use of structure-based approaches in the design of experiments to explore PPIs. Many open-source bioinformatics tools, from the commonly used to the relatively esoteric, have been built and validated to analyse proteins and/or make predictions about them. How can these tools be combined into a protocol that helps the non-expert bioinformaticist design experiments to explore the PPIs of their unique ORF? Here we define a pragmatic protocol and make it concrete by applying it to two proteins, ImuB and ImuA’ from Mycobacterium tuberculosis. The protocol is pragmatic in that decisions are made largely on the basis of the availability of easy-to-use freeware. We define the following basic, user-friendly software pathway for building testable PPI hypotheses for a query protein sequence: PSI-PRED → MUSTER → metaPPISP → ASAView and ConSurf. Where possible, other analytical and/or predictive tools may be included. Our protocol combines the software predictions and analyses with general bioinformatics principles to arrive at consensus, prioritised and testable PPI hypotheses.
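As an illustration of what "consensus, prioritised" hypotheses could look like in practice, the sketch below combines three per-residue signals of the kind the pathway produces: an interface-prediction score, a relative solvent accessibility, and a conservation grade. It is a minimal sketch and not the authors' procedure. The record layout, the function name `prioritise`, the two-of-three voting rule and all cutoffs are assumptions made for illustration, and the toy values are invented rather than taken from ImuB or ImuA’. The only convention borrowed from the tools is the ConSurf grade scale, which runs from 1 (variable) to 9 (conserved).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueEvidence:
    """Per-residue evidence gathered by hand from the web servers (hypothetical layout)."""
    position: int
    residue: str
    interface_score: float   # interface-prediction score, assumed to lie in [0, 1]
    relative_asa: float      # relative solvent accessibility, assumed to lie in [0, 1]
    conservation_grade: int  # ConSurf-style grade: 1 = variable, 9 = conserved


def prioritise(residues, interface_cutoff=0.5, asa_cutoff=0.25, conservation_cutoff=7):
    """Rank residues supported by at least two of the three lines of evidence.

    All cutoffs are illustrative defaults, not values recommended by any tool.
    Returns (position, residue, votes) tuples, best-supported first.
    """
    ranked = []
    for r in residues:
        votes = sum([
            r.interface_score >= interface_cutoff,
            r.relative_asa >= asa_cutoff,
            r.conservation_grade >= conservation_cutoff,
        ])
        if votes >= 2:
            ranked.append((votes, r))
    # More votes first, then higher interface score, then sequence order.
    ranked.sort(key=lambda item: (-item[0], -item[1].interface_score, item[1].position))
    return [(r.position, r.residue, votes) for votes, r in ranked]


# Invented toy values, used only to show the call.
toy_residues = [
    ResidueEvidence(10, "L", 0.72, 0.40, 8),
    ResidueEvidence(11, "G", 0.20, 0.60, 3),
    ResidueEvidence(12, "R", 0.55, 0.10, 9),
    ResidueEvidence(13, "A", 0.10, 0.05, 2),
    ResidueEvidence(14, "W", 0.81, 0.35, 5),
]

for position, residue, votes in prioritise(toy_residues):
    print(f"{residue}{position}: supported by {votes} of 3 signals")
```

A simple vote count keeps the ranking easy to audit by a non-expert. A weighted score would be an equally reasonable choice if one signal is trusted more than the others.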


Keywords: Functional genomics; Sequence similarity; Sequence identity; Homology; Fold recognition; Protein–protein interface



ORF: Open reading frame

PPIs: Protein–protein interactions

Supplementary material

Supplementary material 1: 10969_2012_9141_MOESM1_ESM.doc (DOC, 787 kb)
Supplementary material 2: 10969_2012_9141_MOESM2_ESM.doc (DOC, 57 kb)
Supplementary material 3: 10969_2012_9141_MOESM3_ESM.doc (DOC, 33 kb)
Supplementary material 4: 10969_2012_9141_MOESM4_ESM.doc (DOC, 32 kb)
Supplementary material 5: 10969_2012_9141_MOESM5_ESM.doc (DOC, 541 kb)
Supplementary material 6: 10969_2012_9141_MOESM6_ESM.doc (DOC, 1046 kb)
Supplementary material 7: 10969_2012_9141_MOESM7_ESM.doc (DOC, 44 kb)
Supplementary material 8: 10969_2012_9141_MOESM8_ESM.doc (DOC, 49 kb)
Supplementary material 9: 10969_2012_9141_MOESM9_ESM.doc (DOC, 903 kb)
Supplementary material 10: 10969_2012_9141_MOESM10_ESM.doc (DOC, 91 kb)
Supplementary material 11: 10969_2012_9141_MOESM11_ESM.doc (DOC, 1460 kb)
Supplementary material 12: 10969_2012_9141_MOESM12_ESM.doc (DOC, 1710 kb)
Supplementary material 13: 10969_2012_9141_MOESM13_ESM.doc (DOC, 1254 kb)
Supplementary material 14: 10969_2012_9141_MOESM14_ESM.doc (DOC, 44 kb)
Supplementary material 15: 10969_2012_9141_MOESM15_ESM.doc (DOC, 33 kb)
Supplementary material 16: 10969_2012_9141_MOESM16_ESM.doc (DOC, 61 kb)



Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  1. Gene Expression and Biophysics Group, Synthetic Biology ERA, Pretoria, South Africa
