Inference and Validation of Networks

  • Ilias N. Flaounas
  • Marco Turchi
  • Tijl De Bie
  • Nello Cristianini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)


We develop a statistical methodology for validating the results of network inference algorithms, based on principles of statistical testing and machine learning. Comparing inferred networks with reference networks, by means of similarity measures and null models, allows us to measure the significance of the results as well as their predictive power. The use of Generalised Linear Models allows us to explain the results in terms of available ground truth that we expect to be only partially relevant. We present these methods for the case of inferring a network of News Outlets from their choice of stories to cover. We compare three simple network inference methods and show how our technique can be used to choose between them. All the methods presented here can be applied directly to other domains where network inference is used.
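The significance test described above can be illustrated with a minimal sketch. It assumes Jaccard similarity on undirected edge sets as the similarity measure and a uniform random graph with the same number of edges as the null model; the abstract names neither choice, and the function names and the toy networks below are hypothetical.

```python
import itertools
import random


def jaccard(edges_a, edges_b):
    """Jaccard similarity between two sets of undirected edges."""
    union = edges_a | edges_b
    if not union:
        return 1.0
    return len(edges_a & edges_b) / len(union)


def random_graph(nodes, n_edges, rng):
    """Assumed null model: a uniform random graph with a fixed edge count."""
    all_pairs = [frozenset(p) for p in itertools.combinations(nodes, 2)]
    return set(rng.sample(all_pairs, n_edges))


def empirical_p_value(inferred, reference, nodes, n_samples=1000, seed=0):
    """Share of null graphs at least as similar to the reference as the inferred one."""
    rng = random.Random(seed)
    observed = jaccard(inferred, reference)
    hits = sum(
        jaccard(random_graph(nodes, len(inferred), rng), reference) >= observed
        for _ in range(n_samples)
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_samples + 1)


def as_edges(pairs):
    """Store each undirected edge as a frozenset so that (A, B) equals (B, A)."""
    return {frozenset(p) for p in pairs}


# Toy data, invented for illustration only.
nodes = list("ABCDEF")
reference = as_edges([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
inferred = as_edges([("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")])

score, p_value = empirical_p_value(inferred, reference, nodes)
print(f"Jaccard similarity: {score:.2f}, empirical p-value: {p_value:.3f}")
```

A small p-value suggests that the inferred network resembles the reference more closely than random graphs of the same size do. A stricter null model, for example one that also preserves the degree of each node, would be a natural variation on this sketch.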


Network inference, Network validation, News Outlets network



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ilias N. Flaounas (1)
  • Marco Turchi (2)
  • Tijl De Bie (2)
  • Nello Cristianini (1, 2)
  1. Department of Computer Science, Bristol University, Bristol, United Kingdom
  2. Department of Engineering Mathematics, Bristol University, Bristol, United Kingdom
