An Important Connection Between Network Motifs and Parsimony Models

  • Teresa M. Przytycka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3909)


We demonstrate an important connection between network motifs in certain biological networks and the validity of evolutionary trees constructed using parsimony methods. Parsimony methods assume that taxa are described by a set of characters and infer phylogenetic trees by minimizing the number of character changes required to explain the observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph theoretical characterization that helps to select correct characters. Given a set of characters and a set of taxa, we construct a network called the character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is characterized by significant under-representation of subnetworks known as holes, and provide a mathematical validation for this observation. This characterization explains the success in constructing evolutionary trees using parsimony methods for some characters (e.g. protein domains) and the lack of such success for other characters (e.g. introns). In the latter case, the understanding of mathematical obstacles to applying parsimony methods in a direct way has led us to a new approach for dealing with inconsistent and/or noisy data. Namely, we introduce the concept of persistent characters, which is similar to but less restrictive than the well-known concept of pairwise compatible characters. Application of this approach to introns produces an evolutionary tree consistent with the Coelomata hypothesis. In contrast, the direct application of a parsimony method, using introns as characters, produces a tree that is inconsistent with either of the two competing evolutionary hypotheses. Similarly, replacing persistence with pairwise compatibility does not lead to a correct tree. This indicates that the concept of persistence provides an important addition to parsimony methods.
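
The construction described in the abstract can be illustrated with a small sketch. The abstract does not spell out the edge rule, so the code assumes that characters are the vertices and that two characters are joined whenever some taxon carries both of them. A hole is taken to be a chordless cycle of length at least four, so a graph without holes is chordal. The function names `character_overlap_graph` and `is_chordal` are hypothetical, and the chordality test (repeatedly deleting a simplicial vertex) is a standard textbook check, not necessarily the procedure used in the paper.

```python
from itertools import combinations


def character_overlap_graph(taxa):
    """Build an adjacency map over characters.

    taxa: dict mapping a taxon name to the set of characters present in it.
    Assumed edge rule: two characters are adjacent if some taxon has both.
    """
    graph = {c: set() for chars in taxa.values() for c in chars}
    for chars in taxa.values():
        for a, b in combinations(sorted(chars), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def is_chordal(graph):
    """Return True if the graph has no hole (chordless cycle of length >= 4).

    A graph is chordal exactly when its vertices can be removed one by one,
    each time picking a vertex whose remaining neighbours form a clique.
    """
    adj = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    while adj:
        simplicial = next(
            (v for v, nbrs in adj.items()
             if all(b in adj[a] for a, b in combinations(nbrs, 2))),
            None,
        )
        if simplicial is None:
            return False  # no simplicial vertex left, so a hole exists
        for n in adj.pop(simplicial):
            adj[n].discard(simplicial)
    return True


# Toy data (made up for illustration): characters overlapping along a path.
path_like = {"t1": {"A", "B"}, "t2": {"B", "C"}, "t3": {"C", "D"}}
# Adding a taxon that carries both A and D closes the cycle A-B-C-D.
with_hole = dict(path_like, t4={"D", "A"})

print(is_chordal(character_overlap_graph(path_like)))
print(is_chordal(character_overlap_graph(with_hole)))
```

In the toy data, the first set of taxa yields a path on four characters, which contains no cycle at all, while the second yields a four-cycle with no chord, the smallest possible hole. Real character sets would need a statistical comparison against a random model to speak of under-representation, which this sketch does not attempt.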


Keywords: Evolutionary Tree, Parsimony Model, Network Motif, Chordal Graph, Parsimony Method





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Teresa M. Przytycka
    • 1
  1. National Center for Biotechnology Information, US National Library of Medicine, National Institutes of Health, Bethesda, USA
