Predicting Interacting Protein Pairs by Coevolutionary Paralog Matching

  • Thomas Gueudré
  • Carlo Baldassi
  • Andrea Pagnani
  • Martin Weigt
Part of the Methods in Molecular Biology book series (MIMB, volume 2074)


Even if we know that two families of homologous proteins interact, we do not necessarily know which specific proteins interact inside each species. The reason is that most families contain paralogs, i.e., more than one homologous sequence per species. We have developed a tool to predict interacting paralogs between the two protein families, based on the idea of inter-protein coevolution: our algorithm matches those members of the two protein families that belong to the same species and collectively maximize the detectable coevolutionary signal. It is applicable even in cases where simpler methods fail, e.g., methods based on genomic co-localization of genes coding for interacting proteins, or orthology-based methods. In this method paper, we present an efficient implementation of this idea based on freely available software.
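To make the matching idea concrete, here is a minimal Python sketch of the within-species matching step. It is not the authors' implementation. It assumes, purely for illustration, that a fixed pairwise coevolution score is already available for every candidate pair of paralogs inside a species. In the method described above, the signal is a collective property of the whole matched alignment, so such fixed scores are a simplification. The names `best_matching`, `match_all_species`, and the score values are hypothetical.

```python
from itertools import permutations


def best_matching(score):
    """Return (total, pairs) for the one-to-one matching with the highest total.

    score[i][j] is a hypothetical, precomputed coevolution score between
    paralog i of family A and paralog j of family B in a single species.
    Brute force over all assignments, so only suitable for a handful of
    paralogs per species.
    """
    n_a, n_b = len(score), len(score[0])
    if n_a > n_b:
        # Transpose so that the smaller family indexes the rows,
        # then flip the resulting pairs back to (A index, B index).
        transposed = [[score[i][j] for i in range(n_a)] for j in range(n_b)]
        total, pairs = best_matching(transposed)
        return total, sorted((i, j) for j, i in pairs)
    best_total, best_pairs = float("-inf"), []
    for cols in permutations(range(n_b), n_a):
        total = sum(score[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_total, best_pairs


def match_all_species(scores_by_species):
    """Match paralogs independently inside each species (assumed score matrices)."""
    return {sp: best_matching(m)[1] for sp, m in scores_by_species.items()}


# Toy input: two species with made-up scores.
scores = {
    "species_1": [[0.9, 0.1], [0.2, 0.8]],  # 2 paralogs in each family
    "species_2": [[0.1, 0.7, 0.3]],         # 1 paralog in A, 3 in B
}
matching = match_all_species(scores)
```

Matching is restricted to pairs from the same species, which is why each species is handled separately. When the two families have different numbers of paralogs in a species, the surplus sequences simply remain unmatched in this sketch.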

Key words

Protein–protein interaction; Coevolution; Predicting interacting paralogs; Direct coupling analysis; Paralog matching



A.P. and M.W. acknowledge funding by the EU H2020 research and innovation program MSCA-RISE-2016 under grant agreement No. 734439 INFERNET.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  • Thomas Gueudré (1)
  • Carlo Baldassi (2, 3)
  • Andrea Pagnani (1, 3, 4)
  • Martin Weigt (5)

  1. Italian Institute for Genomic Medicine, Turin, Italy
  2. Bocconi Institute for Data Science and Analytics, Bocconi University, Milan, Italy
  3. INFN, Sezione di Torino, Torino, Italy
  4. Dipartimento di Scienza Applicata e Tecnologia, Politecnico di Torino, Torino, Italy
  5. Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative (LCQB), Paris, France
