Two local dissimilarity measures for weighted graphs with application to protein interaction networks

  • Jean-Baptiste Angelelli
  • Anaïs Baudot
  • Christine Brun
  • Alain Guénoche
Regular Article

Abstract

We extend the Czekanowski-Dice dissimilarity measure, classically used to cluster the vertices of unweighted graphs, to weighted ones. The first proposed formula corresponds to edges weighted by a probability of existence. The second one is adapted to edges weighted by intensity or strength. We show on simulated graphs that the class identification process is improved by computing weighted compared to unweighted edges. Finally, an application to a drosophila protein network illustrates the fact that using these new formulas improves the ’biological accuracy’ of partitioning.

Keywords

Graph distance Graph partitioning Heuristic optimisation Biological networks 

Mathematics Subject Classification (2000)

05C12 90C35 90C59 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Andjilani M, Droz JP, Benahmed M and Tabone E (2006). Down-regulation of fak and iaps by laminin during cisplatin-induced apoptosis in testicular germ cell tumors. Int J Oncol 28(2): 535–542 Google Scholar
  2. Bader GD and Hogue CW (2003). An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinf 4: 2 CrossRefGoogle Scholar
  3. Baudot A, Martin O, Mouren P, Chevenet F, Guénoche A, Jacq B and Brun C (2006). Prodistin web site: a tool for the functional classification of proteins from interaction networks. BMC Bioinf 22(2): 248–250 Google Scholar
  4. Baudot A, Jacq B and Brun C (2004). A scale of functional divergence for yeast duplicated genes revealed from the analysis of the protein–protein interaction network. Genome Biol 5: R76 CrossRefGoogle Scholar
  5. Brandes U, Gaertler M, Wagner D (2003) Experiments on graph clustering algorithms. In: Di Battista G, Zwick U (eds) ESA 2003. LNCS, vol 2832. Springer, Heidelberg, pp 568–579Google Scholar
  6. Brun C, Chevenet F, Martin D, Wojcik J, Guénoche A and Jacq B (2003). Functional classification of proteins for the prediction of cellular function from a protein–protein interaction network. Genome Biol 5: R6 CrossRefGoogle Scholar
  7. Brun C, Herrmann C and Guénoche A (2004). Clustering proteins from interaction networks for the prediction of cellular functions. BMC Bioinf 5: 95 CrossRefGoogle Scholar
  8. Charon I, Denoed L, Guénoche A and Hudry O (2006). Maximum transfer distance between partitions. J Classif 23(1): 103–121 CrossRefGoogle Scholar
  9. Chen J, Chua HN, Hsu W, Lee ML, Ng SK, Saito R, Sung WK and Wong L (2006). Increasing confidence of protein–protein interactomes. Genome Inf 17(2): 284–297 Google Scholar
  10. Chua HN, Sung WK and Wong L (2006). Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics 22(13): 1623–1630 CrossRefGoogle Scholar
  11. Day W (1981). The complexity of computing metric distances between partitions. Math Soc Sci 1: 269–287 MATHCrossRefGoogle Scholar
  12. Dice LR (1945). Measures of the amount of ecologic association between species. Ecology 26(3): 297–302 CrossRefGoogle Scholar
  13. van Dongen S (2000) Graph clustering by flow simulation. Ph.D. Thesis, University of UtrechtGoogle Scholar
  14. Fichet B and Le Calvé G (1984). Structure géometrique des principaux indices de dissimilarité sur signes de présence–absence. Stat Anal Donnés 3: 11–44 Google Scholar
  15. Formstecher E, Arresta S, Collura V, Hamberger A, Meil A, Trehin A, Reverdy C, Betin V, Maire S, Brun C, Jacq B, Arpin M, Bellaiche Y, Bellusci S, Benaroch P, Bornens M, Chanet R, Chavrier P, Delattre O, Doye V, Fehon R, Faye G, Galli T, Girault JA, Goud B, Johannes L, Junier MP, Mirouse V, Mukherjee A, Papadopoulo D, Perez F, Plessis A, Rosbach M, Ross C, Saule S, Stoppa-Lyonnet D, Vincent A, White M, Legrain P, Wojcik J, Camonis J, Daviet L and Gunzburg J (2005). Protein interaction mapping: a drosophila case study. Genome Res 15: 376–384 CrossRefGoogle Scholar
  16. Gascuel O (1997). BioNJ: an improved version of the nj algorithm based on a simple model of sequence data. Mol Biol Evol 14(7): 685–695 Google Scholar
  17. Girvan M and Newman MEJ (2002). Community structure in social and biological networks. Proc Natl Acad Sci USA 99(12): 7821–7826 MATHCrossRefMathSciNetGoogle Scholar
  18. Guénoche A (2004) Clustering by vertex density in a graph. In: Banks D et al Proceedings of IFCS congress classification, clustering and data mining applications. Springer, Berlin, pp 14–24Google Scholar
  19. Guénoche A (2005) Comparing recent methods in graph partitioning. ICGT’05. Electronic notes in discrete mathematics, vol 22, pp 83–89Google Scholar
  20. Guénoche A (2008) Comparison of algorithms in graph partitioning. RAIRO (to appear)Google Scholar
  21. Kuntz P (1992) Représentation euclidienne d’un graphe abstrait en vue de sa segmentation. Ph.D. Thesis, Ecole des Hautes Etudes en Sciences Sociales, ParisGoogle Scholar
  22. Nemwan MEJ (2006). Modularity and community structure in networks. PNAS 103(23): 8577–8582 CrossRefGoogle Scholar
  23. Pons P and Latapy M (2006). Computing communities in large networks using random walks. J Graph Algorithms Appl 10(2): 191–218 MathSciNetGoogle Scholar
  24. Régnier S (1965) Quelques aspects mathématiques des problèmes de classification automatique. ICC Bull 4, reprint (1983) Math Sci Hum 82:13–29Google Scholar
  25. Zhong W and Sternberg PW (2006). Genome-wide prediction of C. elegans genetic interactions. Science 318: 1481–1484 CrossRefGoogle Scholar

Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Jean-Baptiste Angelelli
    • 1
  • Anaïs Baudot
    • 2
  • Christine Brun
    • 2
  • Alain Guénoche
    • 1
  1. 1.IML, CNRS-Université de la MéditerranéeMarseille cedex 9France
  2. 2.IBDML, CNRS-Université de la MéditerranéeMarseille cedex 9France

Personalised recommendations