Abstract
We develop a statistical methodology for validating the results of network inference algorithms, based on principles of statistical testing and machine learning. Comparing inferred networks with reference networks, by means of similarity measures and null models, allows us to measure both the significance of the results and their predictive power. Generalised Linear Models allow us to explain the results in terms of available ground truth that we expect to be only partially relevant. We present these methods for the case of inferring a network of news outlets from their preferences for which stories to cover. We compare three simple network inference methods and show how our technique can be used to choose between them. All the methods presented here can be applied directly to other domains where network inference is used.
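The comparison against a reference network can be illustrated with a minimal sketch. It assumes Jaccard similarity over edge sets as the similarity measure and a null model that draws the same number of edges uniformly at random; both are illustrative choices and not necessarily those used in the paper. The toy networks and all function names are hypothetical.

```python
import random
from itertools import combinations


def jaccard(edges_a, edges_b):
    """Jaccard similarity between two sets of undirected edges."""
    union = edges_a | edges_b
    if not union:
        return 1.0
    return len(edges_a & edges_b) / len(union)


def random_graph(nodes, n_edges, rng):
    """Null model (assumed): n_edges edges drawn uniformly from all node pairs."""
    all_pairs = [frozenset(pair) for pair in combinations(nodes, 2)]
    return set(rng.sample(all_pairs, n_edges))


def empirical_p_value(inferred, reference, nodes, n_samples=1000, seed=0):
    """Estimate how often a random graph is at least as similar to the
    reference network as the inferred network is."""
    rng = random.Random(seed)
    observed = jaccard(inferred, reference)
    hits = sum(
        jaccard(random_graph(nodes, len(inferred), rng), reference) >= observed
        for _ in range(n_samples)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_samples + 1)


# Toy data (hypothetical): a path-shaped reference network on 8 nodes and an
# inferred network that recovers 6 of its 7 edges.
nodes = list(range(8))
reference = {frozenset(e) for e in
             [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]}
inferred = {frozenset(e) for e in
            [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (0, 7)]}

observed, p_value = empirical_p_value(inferred, reference, nodes)
print(f"Jaccard similarity: {observed:.2f}, empirical p-value: {p_value:.4f}")
```

A small p-value suggests that the overlap with the reference network is unlikely to arise by chance under the chosen null model. A degree-preserving null model would be a stricter alternative, at the cost of a more involved sampler.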
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Flaounas, I.N., Turchi, M., De Bie, T., Cristianini, N. (2009). Inference and Validation of Networks. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04180-8_40
DOI: https://doi.org/10.1007/978-3-642-04180-8_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04179-2
Online ISBN: 978-3-642-04180-8
eBook Packages: Computer Science, Computer Science (R0)