Assessing Coverage of Protein Interaction Data Using Capture–Recapture Models

Kelly, W. P.; Stumpf, M. P. H.

doi:10.1007/s11538-011-9680-2

Original Article
Open access
Published: 26 August 2011

Volume 74, pages 356–374 (2012)

Bulletin of Mathematical Biology

Abstract

Protein interaction networks comprise thousands of individual binary links between distinct proteins. Whilst these data have attracted considerable attention and been the focus of many different studies, the networks, their structure, function, and how they change over time are still not fully known. More importantly, there is still considerable uncertainty regarding their size, and the quality of the available data continues to be questioned. Here, we employ statistical models of the experimental sampling process, in particular capture–recapture methods, in order to assess the false discovery rate and size of protein interaction networks. We use these methods to gauge the ability of different experimental systems to find the true binary interactome. Our model allows us to obtain estimates for the size and false discovery rate from simple considerations regarding the number of repeatedly observed interactions, and provides suggestions as to how we can exploit this information in order to reduce the effects of noise in such data. In particular, our approach does not require a reference dataset. We estimate that more than half of the true physical interactome has now been sampled in yeast.
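As a rough illustration of the capture–recapture idea, the sketch below applies the textbook two-sample Chapman estimator (a bias-corrected form of the Lincoln–Petersen estimator) to two hypothetical interaction screens. It is not the model developed in this article: it assumes the screens are independent, that every true interaction is equally likely to be detected, and that there are no false positives, whereas the article's model also estimates a false discovery rate. The function names and the toy protein labels are invented for the example.

```python
def canonical(pair):
    """Treat an interaction as an unordered pair of protein identifiers."""
    a, b = pair
    return (a, b) if a <= b else (b, a)


def chapman_estimate(screen_a, screen_b):
    """Chapman's bias-corrected Lincoln-Petersen estimate of population size.

    n1, n2 are the numbers of distinct interactions seen in each screen and
    m is the number seen in both (the "recaptures").
    """
    a = {canonical(p) for p in screen_a}
    b = {canonical(p) for p in screen_b}
    n1, n2, m = len(a), len(b), len(a & b)
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Toy data (hypothetical): two small screens with two shared interactions.
screen_a = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
screen_b = [("P2", "P1"), ("P3", "P2"), ("P5", "P6")]

estimate = chapman_estimate(screen_a, screen_b)

# Coverage: fraction of the estimated interactome observed at least once.
observed = {canonical(p) for p in screen_a} | {canonical(p) for p in screen_b}
coverage = len(observed) / estimate

print(f"estimated size: {estimate:.2f}, coverage: {coverage:.2f}")
```

The fewer interactions the two screens share relative to their sizes, the larger the estimated interactome and the lower the implied coverage; false positives, which this sketch ignores, would inflate the estimate because spurious interactions are rarely recaptured.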

Author information

Authors and Affiliations

Centre for Bioinformatics, Imperial College London, London, UK
W. P. Kelly & M. P. H. Stumpf
Institute of Mathematical Sciences, Imperial College London, London, UK
M. P. H. Stumpf

Corresponding author

Correspondence to M. P. H. Stumpf.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Kelly, W.P., Stumpf, M.P.H. Assessing Coverage of Protein Interaction Data Using Capture–Recapture Models. Bull Math Biol 74, 356–374 (2012). https://doi.org/10.1007/s11538-011-9680-2

Received: 27 August 2010
Accepted: 14 July 2011
Published: 26 August 2011
Issue Date: February 2012
DOI: https://doi.org/10.1007/s11538-011-9680-2
