Abstract
Conditional random fields are one of the most popular structured prediction models. Nevertheless, the problem of incorporating domain knowledge into the model is poorly understood and remains an open issue. We explore a new approach for incorporating a particular form of domain knowledge through generalized isotonic constraints on the model parameters. The resulting approach has a clear probabilistic interpretation and efficient training procedures. We demonstrate the applicability of our framework with an experimental study on sentiment prediction and information extraction tasks.
References
Barlow, R. E., Bartholomew, D. J., Bremner, J. M., & Brunk, H. D. (1972). Statistical inference under order restrictions (the theory and application of isotonic regression). New York: Wiley.
Chang, M., Ratinov, L., & Roth, D. (2007). Guiding semi-supervision with constraint-driven learning. In Proceedings of the 45th annual meeting of the association of computational linguistics.
Collins, M. (2002). Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of the ACL-02 conference on empirical methods in natural language processing.
Druck, G., Mann, G., & McCallum, A. (2006). Leveraging existing resources using generalized expectation criteria. In NIPS workshop on learning problem design.
Garthwaite, P. H., Kadane, J., & O’Hagan, A. (2005). Statistical methods for eliciting probability distributions. Journal of the American Statistical Association, 100, 680–701.
Grenager, T., Klein, D., & Manning, C. D. (2005). Unsupervised learning of field segmentation models for information extraction. In Proceedings of the 43rd annual meeting on association for computational linguistics.
Haghighi, A., & Klein, D. (2006). Prototype-driven learning for sequence models. In Proceedings of the human language technology conference of the NAACL.
Hirotsu, C. (1978). Ordered alternatives for interaction effects. Biometrika, 65(3), 561–570.
Daumé III, H., Langford, J., & Marcu, D. (2009). Search-based structured prediction. Machine Learning Journal.
Lafferty, J., Pereira, F., & McCallum, A. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proc. of the international conference on machine learning.
Mao, Y., & Lebanon, G. (2007). Isotonic conditional random fields and local sentiment flow. In Advances in neural information processing systems 19 (pp. 961–968).
McCallum, A., Freitag, D., & Pereira, F. (2000). Maximum entropy Markov models for information extraction and segmentation. In Proc. of the international conference on machine learning.
McDonald, R., Hannan, K., Neylon, T., Wells, M., & Reynar, J. (2007). Structured models for fine-to-coarse sentiment analysis. In Proceedings of the 45th annual meeting of the association of computational linguistics.
O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., Oakley, J. E., & Rakow, T. (2006). Uncertain judgements: Eliciting experts’ probabilities. New York: Wiley.
Pang, B., & Lee, L. (2004). A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proc. of the association of computational linguistics.
Pang, B., & Lee, L. (2005). Seeing stars: Exploiting class relationship for sentiment categorization with respect to rating scales. In Proc. of the association of computational linguistics.
Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up?: Sentiment classification using machine learning techniques. In EMNLP ’02: Proceedings of the ACL-02 conference on empirical methods in natural language processing.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.
Silvapulle, M. J., & Sen, P. K. (2004). Constrained statistical inference: Order, inequality, and shape constraints. New York: Wiley.
Stanley, R. P. (2000). Enumerative combinatorics (Vol. 1). Cambridge: Cambridge University Press.
Taskar, B., Guestrin, C., & Koller, D. (2004). Max-margin Markov networks. In Advances in neural information processing systems.
Additional information
Editors: Charles Parker, Yasemin Altun, and Prasad Tadepalli.
A shorter version of this paper appeared as the conference paper (Mao and Lebanon 2007).
Cite this article
Mao, Y., Lebanon, G. Generalized isotonic conditional random fields. Mach Learn 77, 225–248 (2009). https://doi.org/10.1007/s10994-009-5139-1
DOI: https://doi.org/10.1007/s10994-009-5139-1