Information Theoretic Prototype Selection for Unattributed Graphs

  • Lin Han
  • Luca Rossi
  • Andrea Torsello
  • Richard C. Wilson
  • Edwin R. Hancock
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7626)


In this paper we propose a method for selecting the prototype size for a set of sample graphs. Our first contribution is to show how approximate set coding can be extended from the vector domain to the graph domain. With this framework in hand, we show how prototype selection can be posed as optimizing the mutual information between two partitioned sets of sample graphs, and how the resulting method can be used to select the size of the prototype graph. In our experiments, we apply the method to a real-world dataset and investigate its performance on prototype size selection tasks.
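As a rough illustration of the mutual-information criterion mentioned in the abstract, the sketch below scores how well a finite set of candidate hypotheses agrees across two halves of a dataset, using the partition-function form commonly given for approximate set coding: log(K · Z12 / (Z1 · Z2)). This is not the authors' implementation. The function names, the toy cost values, and the simplifying assumption that the number of transformations equals the number of enumerated hypotheses K are all illustrative assumptions; in the graph setting the partition functions cannot be enumerated like this and would have to be estimated, for example by importance sampling.

```python
import math


def log_partition(costs, beta):
    """Log of Z = sum_c exp(-beta * cost_c), computed with a min-shift for stability."""
    m = min(costs)
    return -beta * m + math.log(sum(math.exp(-beta * (c - m)) for c in costs))


def asc_score(costs_a, costs_b, beta):
    """Approximate-set-coding style score log(K * Z12 / (Z1 * Z2)).

    costs_a[i] and costs_b[i] are the (hypothetical) costs of candidate i
    on the two halves of the sample set. K = len(costs_a) is assumed to
    stand in for the size of the transformation set.
    """
    k = len(costs_a)
    joint = [a + b for a, b in zip(costs_a, costs_b)]
    return (math.log(k)
            + log_partition(joint, beta)
            - log_partition(costs_a, beta)
            - log_partition(costs_b, beta))


def best_beta(costs_a, costs_b, betas):
    """Return the inverse temperature from `betas` that maximizes the score."""
    return max(betas, key=lambda b: asc_score(costs_a, costs_b, b))


if __name__ == "__main__":
    # Toy costs for three candidate prototypes on two halves (made-up numbers).
    half_a = [0.0, 5.0, 9.0]
    half_b = [0.5, 4.0, 8.0]
    for beta in (0.0, 0.5, 2.0):
        print(beta, asc_score(half_a, half_b, beta))
```

Under these assumptions the score is zero at beta = 0, where all candidates are weighted equally, and it approaches log K when both halves single out the same candidate at large beta. Comparing the maximized score across candidate prototype sizes is one way such a criterion could be used for size selection.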


Prototype Selection, Mutual information, Importance Sampling, Partition function



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Lin Han (1)
  • Luca Rossi (2)
  • Andrea Torsello (2)
  • Richard C. Wilson (1)
  • Edwin R. Hancock (1)
  1. Department of Computer Science, University of York, UK
  2. Department of Environmental Science, Informatics and Statistics, Ca’ Foscari University of Venice, Italy
