
Machine Learning, Volume 73, Issue 1, pp 87–106

A bias/variance decomposition for models using collective inference


Abstract

Bias/variance analysis is a useful tool for investigating the performance of machine learning algorithms. Conventional analysis decomposes loss into errors due to aspects of the learning process, but in relational domains the collective inference process used for prediction introduces an additional source of error, both through the use of approximate inference algorithms and through variation in the availability of test-set information. To date, the impact of inference error on model performance has not been investigated. We propose a new bias/variance framework that decomposes loss into errors due to both the learning and inference processes. We evaluate the performance of three relational models on both synthetic and real-world datasets and show that (1) inference can be a significant source of error, and (2) the models exhibit different types of errors as data characteristics are varied.
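To make the conventional decomposition the abstract builds on concrete, the sketch below estimates the standard squared-loss decomposition (bias² plus variance, following Geman et al.) by repeatedly resampling training sets and averaging the resulting predictors. This is a minimal illustration of the non-relational baseline only, not the paper's relational framework; the synthetic regression task, polynomial learner, and all parameter values are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    """Noise-free target function (assumed for this synthetic example)."""
    return np.sin(x)

def fit_and_predict(x_train, y_train, x_test, degree):
    """Learn a degree-`degree` polynomial and predict at the test points."""
    coeffs = np.polyfit(x_train, y_train, degree)
    return np.polyval(coeffs, x_test)

def bias_variance(degree, n_trials=200, n_train=30, noise_sd=0.3):
    """Monte Carlo estimate of squared bias and variance over training sets.

    Draws `n_trials` independent training sets, fits a model to each, and
    decomposes the expected squared error at fixed test points into
    (mean prediction - truth)^2 plus the variance of predictions.
    """
    x_test = np.linspace(0.0, np.pi, 50)
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        x_train = rng.uniform(0.0, np.pi, n_train)
        y_train = true_fn(x_train) + rng.normal(0.0, noise_sd, n_train)
        preds[t] = fit_and_predict(x_train, y_train, x_test, degree)
    mean_pred = preds.mean(axis=0)          # average model over training sets
    bias2 = np.mean((mean_pred - true_fn(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))   # spread of models around their mean
    return bias2, variance
```

Running `bias_variance(1)` versus `bias_variance(5)` exhibits the familiar trade-off: the simpler model has higher bias and lower variance than the more flexible one. The paper's contribution is to extend this kind of decomposition with terms attributable to the inference process, which the baseline above does not model.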

Keywords

Statistical relational learning · Collective inference · Evaluation


Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Departments of Computer Science and Statistics, Purdue University, West Lafayette, USA
  2. Department of Computer Science, University of Massachusetts Amherst, Amherst, USA
