The Combination of Text Classifiers Using Reliability Indicators

Bennett, Paul N.; Dumais, Susan T.; Horvitz, Eric

doi:10.1023/B:INRT.0000048491.59134.94

Published: January 2005

Information Retrieval, Volume 8, pages 67–100 (2005)

Abstract

The intuition that different text classifiers behave in qualitatively different ways has long motivated attempts to build a better metaclassifier via some combination of classifiers. We introduce a probabilistic method for combining classifiers that considers the context-sensitive reliabilities of contributing classifiers. The method harnesses reliability indicators—variables that provide signals about the performance of classifiers in different situations. We provide background, present procedures for building metaclassifiers that take into consideration both reliability indicators and classifier outputs, and review a set of comparative studies undertaken to evaluate the methodology.
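
As an illustration only, the idea of a metaclassifier that sees reliability indicators alongside base classifier outputs can be sketched in a few lines of Python. This is not the authors' procedure: the sketch assumes two hypothetical base classifiers (A and B) that each emit a score in [0, 1], a single hypothetical reliability indicator (`is_long`, flagging long documents), invented training data, and a small logistic-regression metaclassifier trained by gradient descent, swapped in here for simplicity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_features(score_a, score_b, is_long):
    """Gate each centred base score by the (hypothetical) reliability indicator.

    is_long is 0.0 for short documents and 1.0 for long ones, so the
    metaclassifier can learn a separate weight per classifier per context.
    """
    a = score_a - 0.5
    b = score_b - 0.5
    return [a * (1.0 - is_long), b * (1.0 - is_long), a * is_long, b * is_long]


def predict_proba(model, features):
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, features)) + bias)


def train_metaclassifier(rows, labels, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba((weights, bias), x) - y
            for i, v in enumerate(x):
                grad_w[i] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


# Invented data: (score_a, score_b, is_long, true_label).
# Classifier A tracks the label on short documents, B on long ones.
examples = [
    (0.9, 0.2, 0.0, 1), (0.1, 0.8, 0.0, 0), (0.8, 0.7, 0.0, 1), (0.2, 0.3, 0.0, 0),
    (0.2, 0.9, 1.0, 1), (0.8, 0.1, 1.0, 0), (0.7, 0.8, 1.0, 1), (0.3, 0.2, 1.0, 0),
]
rows = [build_features(sa, sb, r) for sa, sb, r, _ in examples]
labels = [y for _, _, _, y in examples]
model = train_metaclassifier(rows, labels)

# Same base outputs (A says positive, B says negative), different context.
p_short = predict_proba(model, build_features(0.9, 0.1, 0.0))
p_long = predict_proba(model, build_features(0.9, 0.1, 1.0))
print("short document:", round(p_short, 3))
print("long document: ", round(p_long, 3))
```

With this toy data the combined prediction follows classifier A on short documents and classifier B on long ones, even though the base outputs are identical in both cases. A plain vote or average over the two scores could not make that distinction, which is the motivation for conditioning on reliability indicators.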

Author information

Authors and Affiliations

Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Paul N. Bennett
Microsoft Research, One Microsoft Way, Redmond, WA, 98052, USA
Susan T. Dumais & Eric Horvitz

About this article

Cite this article

Bennett, P.N., Dumais, S.T. & Horvitz, E. The Combination of Text Classifiers Using Reliability Indicators. Information Retrieval 8, 67–100 (2005). https://doi.org/10.1023/B:INRT.0000048491.59134.94

Issue Date: January 2005
DOI: https://doi.org/10.1023/B:INRT.0000048491.59134.94
