Reconstruction of Causal Networks by Set Covering

  • Nick Fyson
  • Tijl De Bie
  • Nello Cristianini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6594)


We present a method for the reconstruction of networks, based on the order of nodes visited by a stochastic branching process. Our algorithm reconstructs a network of minimal size that ensures consistency with the data. Crucially, we show that global consistency with the data can be achieved through purely local considerations, inferring the neighbourhood of each node in turn. The optimisation problem solved for each individual node can be reduced to a set covering problem, which is known to be NP-hard but can be approximated well in practice. We then extend our approach to account for noisy data, based on the Minimum Description Length principle. We demonstrate our algorithms on synthetic data, generated by an SIR-like epidemiological model.
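The per-node set covering step can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption: each observed trace is an ordered list of visited nodes, and a node's appearance in a trace counts as explained if at least one chosen parent appears earlier in that same trace. The function names `greedy_set_cover` and `infer_parents` are hypothetical, and the greedy rule is the standard set-covering heuristic rather than necessarily the authors' implementation.

```python
from typing import Dict, Hashable, List, Sequence, Set


def greedy_set_cover(universe: Set[int],
                     subsets: Dict[Hashable, Set[int]]) -> List[Hashable]:
    """Standard greedy heuristic: repeatedly pick the subset that covers
    the largest number of still-uncovered elements."""
    uncovered = set(universe)
    chosen: List[Hashable] = []
    while uncovered:
        best = max(subsets, key=lambda k: len(subsets[k] & uncovered),
                   default=None)
        if best is None or not (subsets[best] & uncovered):
            break  # the remaining elements cannot be covered by any subset
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


def infer_parents(node: Hashable,
                  traces: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Assumed local problem: choose a small set of parents such that, in
    every trace where `node` is visited (and is not the first node), at
    least one chosen parent was visited earlier."""
    universe: Set[int] = set()                  # trace indices to explain
    candidates: Dict[Hashable, Set[int]] = {}   # parent -> traces it explains
    for idx, trace in enumerate(traces):
        if node not in trace:
            continue
        pos = list(trace).index(node)
        if pos == 0:
            continue  # node started this trace, so no parent is needed
        universe.add(idx)
        for earlier in trace[:pos]:
            candidates.setdefault(earlier, set()).add(idx)
    return greedy_set_cover(universe, candidates)


if __name__ == "__main__":
    toy_traces = [["a", "b", "c"], ["b", "c"], ["a", "c"], ["c", "a"]]
    print(infer_parents("c", toy_traces))
```

Running `infer_parents` once per node and taking the union of the resulting edges gives a network built from purely local decisions, in the spirit of the local-to-global argument described above. This sketch does not handle noisy traces.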


Keywords: machine learning, network inference, data mining, complex systems, minimum description length





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nick Fyson (1, 2)
  • Tijl De Bie (1)
  • Nello Cristianini (1)
  1. Intelligent Systems Laboratory, Bristol University, Bristol, UK
  2. Bristol Centre for Complexity Sciences, Bristol University, Bristol, UK
