Knowledge and Information Systems, Volume 30, Issue 1, pp 31–55

Correcting evaluation bias of relational classifiers with network cross validation

  • Jennifer Neville
  • Brian Gallagher
  • Tina Eliassi-Rad
  • Tao Wang
Open Access
Regular Paper

Abstract

Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that result in significantly different levels of performance. We show that the commonly used form of evaluation (a paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). We propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1 − Type II error).
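The contrast drawn in the abstract, between paired t-tests on overlapping samples and on cross-validation folds, can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes only that the test sets are made disjoint by partitioning the node ids, and it computes the standard paired t statistic over per-fold score differences. The helper names `disjoint_folds` and `paired_t_statistic` and the accuracy values are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def disjoint_folds(nodes, k, seed=0):
    """Shuffle node ids and split them into k non-overlapping test sets."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    # Striding by k assigns every node to exactly one fold.
    return [shuffled[i::k] for i in range(k)]


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic over per-fold differences (k - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    k = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0.0:
        raise ValueError("all fold differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(k))


# Hypothetical setup: 100 nodes, 5 disjoint test folds.
folds = disjoint_folds(range(100), k=5)

# Made-up per-fold accuracies for two classifiers evaluated on the same folds.
acc_a = [0.80, 0.82, 0.78, 0.81, 0.79]
acc_b = [0.75, 0.80, 0.74, 0.78, 0.77]

t = paired_t_statistic(acc_a, acc_b)
print(f"paired t statistic over {len(folds)} folds: {t:.3f}")
```

The statistic would then be compared against a t distribution with k − 1 degrees of freedom. Disjoint folds only remove overlap between test sets; linked nodes in different folds can still be correlated, which is part of what makes evaluation in network data difficult.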

Keywords

Relational learning, Collective classification, Statistical tests, Methodology

Copyright information

© The Author(s) 2010

Authors and Affiliations

  • Jennifer Neville (1)
  • Brian Gallagher (2)
  • Tina Eliassi-Rad (3)
  • Tao Wang (4)
  1. Departments of Computer Science and Statistics, Purdue University, West Lafayette, USA
  2. Lawrence Livermore National Laboratory, Livermore, USA
  3. Department of Computer Science, Rutgers University, Piscataway, USA
  4. Department of Computer Science, Purdue University, West Lafayette, USA