A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs

Zawaira, Alexander; Shibayama, Youtaro

doi:10.1007/s10969-012-9141-7

Published: 07 September 2012

Volume 13, pages 185–200, (2012)

Journal of Structural and Functional Genomics

Alexander Zawaira¹ &
Youtaro Shibayama¹

Abstract

Studying the protein–protein interactions (PPIs) of unique ORFs is one strategy for deciphering their biological roles. For uniform reference, we define unique ORFs as those for which no matching protein is found after a PDB-BLAST search with default parameters. The uniqueness of these ORFs generally precludes the straightforward use of structure-based approaches in designing experiments to explore PPIs. Many open-source bioinformatics tools, from the commonly used to the relatively esoteric, have been built and validated to perform analyses and predictions on proteins. How can these available tools be combined into a protocol that helps the non-expert bioinformaticist design experiments to explore the PPIs of their unique ORF? Here we define a pragmatic protocol based on accessibility of software, and we make it concrete by applying it to two proteins, ImuB and ImuA’ from Mycobacterium tuberculosis. The protocol is pragmatic in that decisions are made largely based on the availability of easy-to-use freeware. We define the following basic and user-friendly software pathway to build testable PPI hypotheses for a query protein sequence: PSI-PRED → MUSTER → metaPPISP → ASAView and ConSurf. Where possible, other analytical and/or predictive tools may be included. Our protocol combines the software predictions and analyses with general bioinformatics principles to arrive at consensus, prioritised and testable PPI hypotheses.
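The final step of the pathway, combining per-residue predictions into consensus, prioritised hypotheses, could be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the `ResiduePrediction` record, the simple voting rule, the cutoff values and the toy numbers are all assumptions made for the example. Real inputs would be read from the interface-prediction (metaPPISP), solvent-accessibility (ASAView) and conservation (ConSurf) outputs for the query sequence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResiduePrediction:
    """Hypothetical per-residue record merged from the three analysis tools."""
    position: int             # 1-based position in the query sequence
    residue: str              # one-letter amino acid code
    interface_score: float    # assumed 0..1 interface propensity (metaPPISP-like)
    rel_accessibility: float  # assumed 0..1 relative solvent accessibility (ASAView-like)
    conservation: int         # conservation grade, 1..9 with 9 most conserved (ConSurf-like)


def votes(r: ResiduePrediction,
          interface_cutoff: float = 0.5,
          accessibility_cutoff: float = 0.25,
          conservation_cutoff: int = 7) -> int:
    """Count how many lines of evidence support the residue (cutoffs are illustrative)."""
    return sum([
        r.interface_score >= interface_cutoff,
        r.rel_accessibility >= accessibility_cutoff,
        r.conservation >= conservation_cutoff,
    ])


def prioritise(residues: List[ResiduePrediction],
               min_votes: int = 2) -> List[ResiduePrediction]:
    """Keep residues with at least min_votes, ranked by votes, then interface score."""
    candidates = [r for r in residues if votes(r) >= min_votes]
    return sorted(candidates,
                  key=lambda r: (-votes(r), -r.interface_score, r.position))


# Toy values for illustration only; they are not results from the article.
residues = [
    ResiduePrediction(10, "L", 0.72, 0.60, 8),
    ResiduePrediction(11, "G", 0.20, 0.10, 9),
    ResiduePrediction(42, "R", 0.55, 0.40, 3),
    ResiduePrediction(57, "W", 0.81, 0.05, 9),
    ResiduePrediction(90, "K", 0.30, 0.70, 2),
]

ranked = prioritise(residues)
for r in ranked:
    print(f"{r.residue}{r.position}: {votes(r)} of 3 lines of evidence")
```

Residues at the top of such a list would be natural first candidates for site-directed mutagenesis. A simple vote is used here because it is easy to inspect by hand; a weighted score would be an equally reasonable choice.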

Abbreviations

ORF: Open reading frame
TB: Tuberculosis
PPIs: Protein–protein interactions

Acknowledgments

A. Z. holds a Sydney Brenner Fellowship. Y. S. holds a Claude Leon Fellowship. We thank Professor Sir Tom Blundell of the University of Cambridge, UK, for reading the manuscript and making helpful suggestions. We thank Professor Yang Zhang of the University of Michigan, USA, for reading the manuscript. We thank Mr. Renxiang Yan of the University of Michigan, USA, for technical assistance with local runs of the MUSTER program. We thank Dr Tichaona Mangwende for helpful discussions and suggestions.

Author information

Authors and Affiliations

Gene Expression and Biophysics Group, Synthetic Biology, ERA, Building 20, CSIR Biosciences, Meiring Naude Road, Brummeria, Pretoria, 0001, South Africa
Alexander Zawaira & Youtaro Shibayama

Corresponding author

Correspondence to Alexander Zawaira.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOC 787 kb)

Supplementary material 2 (DOC 57 kb)

Supplementary material 3 (DOC 33 kb)

Supplementary material 4 (DOC 32 kb)

Supplementary material 5 (DOC 541 kb)

Supplementary material 6 (DOC 1046 kb)

Supplementary material 7 (DOC 44 kb)

Supplementary material 8 (DOC 49 kb)

Supplementary material 9 (DOC 903 kb)

Supplementary material 10 (DOC 91 kb)

Supplementary material 11 (DOC 1460 kb)

Supplementary material 12 (DOC 1710 kb)

Supplementary material 13 (DOC 1254 kb)

Supplementary material 14 (DOC 44 kb)

Supplementary material 15 (DOC 33 kb)

Supplementary material 16 (DOC 61 kb)

About this article

Cite this article

Zawaira, A., Shibayama, Y. A simple recipe for the non-expert bioinformaticist for building experimentally-testable hypotheses for proteins with no known homologs. J Struct Funct Genomics 13, 185–200 (2012). https://doi.org/10.1007/s10969-012-9141-7

Received: 29 February 2012
Accepted: 08 August 2012
Published: 07 September 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s10969-012-9141-7
