Encyclopedia of Machine Learning

2010 Edition | Editors: Claude Sammut, Geoffrey I. Webb

Collective Classification

  • Prithviraj Sen
  • Galileo Namata
  • Mustafa Bilgic
  • Lise Getoor
Reference work entry
DOI: https://doi.org/10.1007/978-0-387-30164-8_140



Many real-world classification problems are best described as a set of objects interconnected via links to form a network structure. The links in the network denote relationships among the instances, and the class labels of linked instances are often correlated. Thus, knowing the correct label for one instance improves our knowledge about the correct label assignments of the instances it connects to. The goal of collective classification is to jointly determine the correct label assignments of all the objects in the network.
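The idea that one instance's label informs the labels of its neighbors can be illustrated with a minimal sketch. It assumes a deliberately simple scheme, repeated majority voting over neighbor labels with a few observed labels held fixed, and is not presented as the method described in this entry. The function name collective_classify, the toy graph, and the "spam"/"ham" labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def collective_classify(edges, known_labels, max_iters=10):
    """Label every node by repeated majority vote over neighbor labels.

    Illustrative assumption: links are undirected and carry no weights,
    and observed labels are never overwritten.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    labels = dict(known_labels)
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbors):
            if node in known_labels:
                continue  # observed labels stay fixed
            votes = Counter(labels[n] for n in neighbors[node] if n in labels)
            if not votes:
                continue  # no labeled neighbor yet; retry on the next pass
            # Highest vote count wins; ties are broken by label name.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # assignments are stable
    return labels


# Hypothetical toy network: two triangles joined by the link c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
known = {"a": "spam", "e": "ham", "f": "ham"}
result = collective_classify(edges, known)
```

In this toy network the unlabeled nodes b and c end up with the label of the cluster around a, while d takes the label shared by e and f. The assignments are determined jointly because each pass reuses the labels predicted in earlier steps.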

Motivation and Background

Traditionally, a major focus of machine learning has been solving classification problems: given a corpus of documents, classify each according to its topic label; given a collection of e-mails, determine which are spam; given a sentence, determine the part-of-speech tag for each word; given a handwritten document, determine the characters; and so on. However, much...



Copyright information

© Springer Science+Business Media, LLC 2011
