Training Classifiers for Tree-structured Categories with Partially Labeled Data
In this paper we propose a new method for training classifiers for multi-class problems when classes are not (necessarily) mutually exclusive and may be related by means of a probabilistic tree structure. It is based on the definition of a Bayesian model relating network parameters, feature vectors and categories. Learning is stated as a maximum likelihood estimation problem of the classifier parameters. The proposed algorithm is specially suited to situations where each training sample is labeled with respect to only one or part of the categories in the tree. Our experiments on information retrieval scenarios show the advantages of the proposed method.
Keywordstraining classifier Bayesian model probabilistic tree structure
- 1.L. Cai and T. Hoffman, “Hierarchical Document Categorization with Support Vector Machines,” in Proc. of CIKM 2004, Washington DC, USA, Nov. 2004.Google Scholar
- 2.J. Keshet, O. Dekel, and Y. Singer, “Large Margin Hierarchical Classification,” in Proc. of the 21st ICML, Banff, Canada, 2004.Google Scholar
- 8.J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, CA, 1988.Google Scholar
- 9.M. I. Jordan and R. A. Jacobs, “Hierarchical Mixtures of Experts and the em Algorithm,” Neural Comput., vol. 6, no. 2, March 1994, pp. 181–214.Google Scholar
- 10.D. D. Lewis, Reuters-21578 Text Categorization Test Collection, Tech. Rep., AT&T Labs–Research, 1997.Google Scholar
- 11.D. D. Lewis, Y. Yang, T. Rose, and F. Li, “Rcv1: A New Benchmark Collection for Text Categorization Research,” J. Mach. Learn. Res., vol. 5, no. 361, 2004, p. 397.Google Scholar
- 13.D. D. Lewis, “Rcv1-v2/lyrl2004: The Lyrl2004 Distribution of the Rcv1-v2 Text Categorization Test Collection,” Tech. Rep., http://www.ics.uci.edu/~kdd/databases/reuters21578/reuters21578.html, 2004.