Bias-Free Hypothesis Evaluation in Multirelational Domains

  • Christine Körner
  • Stefan Wrobel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3918)


In propositional domains, evaluating on a separate test set obtained by random sampling or cross-validation is generally considered to yield an unbiased estimate of the true error. In multirelational domains, previous work has noted that linkage between objects may bias these procedures and has proposed corrected sampling procedures. However, as we show in this paper, the existing procedures address only one particular case of the bias introduced by linkage. We therefore introduce generalized subgraph sampling, a sampling procedure based on bin packing which ensures that test sets are chosen to match the probability of re-encountering previously seen objects, and which includes previous approaches as a special case. Experiments with data from the Internet Movie Database illustrate the performance of our algorithm.
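The abstract gives no algorithmic detail, so the following is only a minimal sketch of the simpler idea of keeping linked objects together when building test sets. It is not the authors' generalized subgraph sampling. It assumes each object links to exactly one neighbor (for example, a movie to its studio), assigns whole neighbor groups to folds, and balances fold sizes with a greedy largest-first bin-packing heuristic. The function name `pack_groups_into_folds` and the toy data are hypothetical.

```python
from collections import defaultdict


def pack_groups_into_folds(object_to_neighbor, n_folds):
    """Assign whole neighbor groups to folds, greedily balancing fold sizes.

    Hypothetical helper: assumes every object has exactly one neighbor.
    """
    # Collect the objects that share each neighbor.
    groups = defaultdict(list)
    for obj, neighbor in object_to_neighbor.items():
        groups[neighbor].append(obj)

    folds = [[] for _ in range(n_folds)]
    # Largest groups first; ties broken by neighbor name for determinism.
    order = sorted(groups, key=lambda k: (-len(groups[k]), str(k)))
    for neighbor in order:
        # Place the whole group into the currently smallest fold.
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(groups[neighbor])
    return folds


# Toy data (assumed): six movies linked to three studios.
links = {"m1": "s1", "m2": "s1", "m3": "s1",
         "m4": "s2", "m5": "s2", "m6": "s3"}
folds = pack_groups_into_folds(links, 2)

# Studios seen in each fold; no studio should appear in both.
studios_per_fold = [{links[m] for m in fold} for fold in folds]
```

With this split, a model tested on one fold never re-encounters a studio from the other fold. According to the abstract, that covers only one particular case; the paper's procedure instead chooses test sets to match the probability of re-encountering previously seen objects.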


Transductive Learning; Neighbor Probability; Inductive Logic Program; Probabilistic Relational Model; Internet Movie Database
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christine Körner (1)
  • Stefan Wrobel (1, 2)
  1. Fraunhofer Institut Autonome Intelligente Systeme, Germany
  2. Dept. of Computer Science III, University of Bonn, Germany
