Abstract
Studying the protein–protein interactions (PPIs) of a unique open reading frame (ORF) is one strategy for deciphering its biological role. For uniform reference, we define unique ORFs as those for which a PDB-BLAST search with default parameters finds no matching protein. This uniqueness generally precludes the straightforward use of structure-based approaches when designing experiments to explore PPIs. Many open-source bioinformatics tools, from the commonly used to the relatively esoteric, have been built and validated to analyse proteins or to make predictions about them. How can these tools be combined into a protocol that helps the non-expert bioinformaticist design experiments to explore the PPIs of a unique ORF? Here we define a pragmatic protocol to achieve this, and we make it concrete by applying it to two proteins, ImuB and ImuA’ from Mycobacterium tuberculosis. The protocol is pragmatic in that decisions are based largely on the availability of easy-to-use freeware. We define the following basic, user-friendly software pathway for building testable PPI hypotheses for a query protein sequence: PSI-PRED → MUSTER → metaPPISP → ASAView and ConSurf. Where possible, other analytical and/or predictive tools may be included. Our protocol combines the software predictions and analyses with general bioinformatics principles to arrive at consensus, prioritised and testable PPI hypotheses.
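As a rough illustration of what a "consensus, prioritised" shortlist could look like once the per-residue outputs of the tools have been collected by hand, the following Python sketch ranks residues by how many independent lines of evidence support them. It is not the authors' procedure: the `ResidueEvidence` record, the `votes` and `prioritise` helpers, the simple vote-counting rule and all three thresholds are assumptions made for illustration, and the example values are invented rather than taken from ImuB or ImuA’.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueEvidence:
    """Per-residue values transcribed by hand from the tool outputs (hypothetical layout)."""
    position: int             # 1-based position in the query sequence
    residue: str              # one-letter amino acid code
    interface_score: float    # interface-prediction score, assumed scaled to 0..1
    rel_accessibility: float  # relative solvent accessibility, assumed scaled to 0..1
    conservation: int         # conservation grade, assumed 1 (variable) to 9 (conserved)


def votes(r, min_interface=0.5, min_accessibility=0.25, min_conservation=7):
    """Count how many of the three criteria a residue satisfies.

    The thresholds are illustrative assumptions, not published cut-offs.
    """
    return sum((
        r.interface_score >= min_interface,
        r.rel_accessibility >= min_accessibility,
        r.conservation >= min_conservation,
    ))


def prioritise(residues, min_votes=2):
    """Keep residues with at least `min_votes` criteria met, best supported first."""
    scored = [(votes(r), r) for r in residues]
    kept = [(v, r) for v, r in scored if v >= min_votes]
    # Sort by vote count, then interface score (both descending), then position.
    kept.sort(key=lambda vr: (-vr[0], -vr[1].interface_score, vr[1].position))
    return [r for _, r in kept]


# Invented example values, for demonstration only.
example = [
    ResidueEvidence(10, "L", 0.80, 0.60, 9),
    ResidueEvidence(11, "G", 0.20, 0.10, 3),
    ResidueEvidence(42, "R", 0.65, 0.40, 5),
    ResidueEvidence(57, "D", 0.30, 0.50, 8),
]

if __name__ == "__main__":
    for r in prioritise(example):
        print(f"{r.residue}{r.position}: {votes(r)} of 3 criteria met")
```

A shortlist of this kind only orders candidate residues for mutagenesis or binding experiments. Tightening `min_votes` trades coverage for confidence, and any real cut-offs would have to come from the documentation of the individual tools.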
Abbreviations
- ORF: Open reading frame
- TB: Tuberculosis
- PPIs: Protein–protein interactions
Acknowledgments
A. Z. holds a Sydney Brenner Fellowship. Y. S. holds a Claude Leon Fellowship. We thank Professor Sir Tom Blundell of the University of Cambridge, UK, for reading the manuscript and making helpful suggestions. We thank Professor Yang Zhang of the University of Michigan, USA, for reading the manuscript. We thank Mr Renxiang Yan of the University of Michigan, USA, for technical assistance with local runs of the MUSTER program. We thank Dr Tichaona Mangwende for helpful discussions and suggestions.
Cite this article
Zawaira, A., Shibayama, Y. A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs. J Struct Funct Genomics 13, 185–200 (2012). https://doi.org/10.1007/s10969-012-9141-7
DOI: https://doi.org/10.1007/s10969-012-9141-7