Blum, A. (1992). Learning boolean functions in an infinite attribute space. Machine Learning (4), 373–386.
Breiman, L. (1994). Bagging predictors (Technical Report 421). University of California, Berkeley.
Cesa-Bianchi, N., Freund, Y., Helmbold, D.P., & Warmuth, M. (1994). On-line prediction and conversion strategies. Computational Learning Theory: Eurocolt'93 (pp. 205–216). Oxford University Press.
Chen, S.F., & Goodman, J. (1996). An empirical study of smoothing techniques for language modeling. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, Santa Cruz, CA.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 273–297.
Dagan, I., Karov, Y., & Roth, D. (1997). Mistake-driven learning in text categorization. EMNLP-97, The Second Conference on Empirical Methods in Natural Language Processing (pp. 55–63).
Dietterich, T.G. (1998). Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation (7), 1895–1924.
Domingos, P., & Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 103–130.
Duda, R.O., & Hart, P.E. (1973). Pattern classification and scene analysis. Wiley.
Fleiss, J.L. (1981). Statistical methods for rates and proportions. John Wiley and Sons.
Flexner, S.B. (Ed.). (1983). Random House unabridged dictionary (2nd ed.). New York: Random House.
Freund, Y., & Schapire, R.E. (1995). A decision-theoretic generalization of on-line learning and an application to boosting. Computational Learning Theory: Eurocolt'95 (pp. 23–37). Springer-Verlag.
Gale, W.A., Church, K.W., & Yarowsky, D. (1993). A method for disambiguating word senses in a large corpus. Computers and the Humanities, 415–439.
Golding, A.R. (1995). A Bayesian hybrid method for context-sensitive spelling correction. Proceedings of the 3rd Workshop on Very Large Corpora, Boston, MA.
Golding, A.R., & Schabes, Y. (1996). Combining trigram-based and feature-based methods for context-sensitive spelling correction. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, Santa Cruz, CA.
Herbster, M., & Warmuth, M. (1995). Tracking the best expert. Proceedings of the 12th International Conference on Machine Learning (pp. 286–294). Morgan Kaufmann.
Holte, R.C., Acker, L.E., & Porter, B.W. (1989). Concept learning and the problem of small disjuncts. Proceedings of the International Joint Conference on Artificial Intelligence, Detroit.
Jones, M.P., & Martin, J.H. (1997). Contextual spelling correction using latent semantic analysis. Proceedings of the 5th Conference on Applied Natural Language Processing, Washington, DC.
Katz, S.M. (1987). Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Trans. on Acoustics, Speech, and Signal Processing (3), 400–401.
Kivinen, J., & Warmuth, M.K. (1995). Exponentiated gradient versus gradient descent for linear predictors. ACM Symp. on the Theory of Computing.
Kneser, R., & Ney, H. (1995). Improved backing-off for m-gram language modeling. Proceedings of the International Conf. on Acoustics, Speech, and Signal Processing (Vol. 1, pp. 181–184).
Kohavi, R., Becker, B., & Sommerfield, D. (1997). Improving simple Bayes. Proceedings of the European Conference on Machine Learning.
Kučera, H., & Francis, W.N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Kukich, K. (1992). Techniques for automatically correcting words in text. ACM Computing Surveys (4), 377–439.
Littlestone, N. (1988). Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 285–318.
Littlestone, N. (1991). Redundant noisy attributes, attribute errors, and linear threshold learning using Winnow. Proceedings of the 4th Annual Workshop on Computational Learning Theory (pp. 147–156). Morgan Kaufmann.
Littlestone, N. (1995). Comparing several linear-threshold learning algorithms on tasks involving superfluous attributes. Proceedings of the 12th International Conference on Machine Learning (pp. 353–361). Morgan Kaufmann.
Littlestone, N., & Warmuth, M.K. (1994). The weighted majority algorithm. Information and Computation (2), 212–261.
Mangu, L., & Brill, E. (1997). Automatic rule acquisition for spelling correction. Proceedings of the 14th International Conference on Machine Learning. Morgan Kaufmann.
Marcus, M.P., Santorini, B., & Marcinkiewicz, M. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics (2), 313–330.
Mays, E., Damerau, F.J., & Mercer, R.L. (1991). Context based spelling correction. Information Processing and Management (5), 517–522.
Ney, H., Essen, U., & Kneser, R. (1994). On structuring probabilistic dependences in stochastic language modelling. Computer Speech and Language, 1–38.
Ng, H.T., & Lee, H.B. (1996). Integrating multiple knowledge sources to disambiguate word sense: An exemplar-based approach. Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, Santa Cruz, CA.
Powers, D. (1997). Learning and application of differential grammars. Proceedings of the ACL Special Interest Group in Natural Language Learning, Madrid.
Reddy, L., & Tadepalli, P. (1997). Active learning with committees for text categorization. Proceedings of the National Conference on Artificial Intelligence (pp. 602–608).
Roth, D. (1998). Learning to resolve natural language ambiguities: A unified approach. Proceedings of the National Conference on Artificial Intelligence (pp. 806–813).
Roth, D., & Zelenko, D. (1998). Part of speech tagging using a network of linear separators. COLING-ACL 98, The 17th International Conference on Computational Linguistics (pp. 1136–1142).
Valiant, L.G. (1994). Circuits of the mind. Oxford University Press.
Valiant, L.G. (1995). Rationality. Workshop on Computational Learning Theory (pp. 3–14).
Yarowsky, D. (1994). Decision lists for lexical ambiguity resolution: Application to accent restoration in Spanish and French. Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM.