RDRCE: Combining Machine Learning and Knowledge Acquisition

  • Han Xu
  • Achim Hoffmann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6232)

Abstract

We present RDRCE (RDR Case Explorer), a new interactive workbench that facilitates the combination of Machine Learning and manual Knowledge Acquisition for Natural Language Processing problems. We show how to apply Brill’s well-regarded transformation-based learning approach and convert its results into an RDR (Ripple Down Rules) tree. RDRCE then guides the systematic inspection of the generated RDR tree so that it can be refined and improved by manually adding rules. RDRCE also helps to quickly recognise potential noise in the training data and to deal with it effectively. Finally, we present a first study that uses RDRCE to build a high-quality Part-of-Speech tagger for English. After some 60 hours of manual knowledge acquisition, the tagger already slightly exceeds state-of-the-art performance on unseen benchmark test data, and with it the fruits of some 15 years of further research in learning methods for Part-of-Speech taggers.
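To make the notion of an RDR tree concrete, the following is a minimal sketch of how a single-classification ripple down rules tree is commonly evaluated. It is not the RDRCE implementation: the names `RDRNode` and `classify`, the token feature keys `word` and `prev_tag`, and the toy rules themselves are all assumptions made for illustration. The first exception rule mimics the style of a transformation learned by Brill's approach ("tag as VB when the previous tag is TO"), and the rule beneath it stands in for a refinement that an expert might add by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical feature dictionary describing one token in its context.
Context = Dict[str, str]


@dataclass
class RDRNode:
    condition: Callable[[Context], bool]
    conclusion: str
    except_child: Optional["RDRNode"] = None  # followed when the condition holds
    else_child: Optional["RDRNode"] = None    # followed when the condition fails


def classify(node: Optional[RDRNode], ctx: Context) -> Optional[str]:
    """Return the conclusion of the last rule that fired on the evaluation path."""
    result = None
    while node is not None:
        if node.condition(ctx):
            result = node.conclusion
            node = node.except_child
        else:
            node = node.else_child
    return result


# Toy tree (illustrative rules only, using Penn Treebank style tag names).
vb_rule = RDRNode(lambda c: c["prev_tag"] == "TO", "VB")
# A refinement of the kind an expert might add as an exception to vb_rule.
vb_rule.except_child = RDRNode(lambda c: c["word"][:1].isupper(), "NNP")
# An alternative tried only when vb_rule does not fire.
vb_rule.else_child = RDRNode(lambda c: c["word"].endswith("ly"), "RB")
# The root always fires and supplies the default tag.
root = RDRNode(lambda c: True, "NN", except_child=vb_rule)

print(classify(root, {"word": "run", "prev_tag": "TO"}))
print(classify(root, {"word": "London", "prev_tag": "TO"}))
```

In this structure a new rule only ever takes effect in the context where its parent fired (or failed), so a manual correction stays local and does not disturb the cases handled elsewhere in the tree.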

Keywords

Knowledge Acquisition, Ripple Down Rules, Machine Learning, TBL, Part-of-Speech tagger

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Han Xu (1)
  • Achim Hoffmann (1)
  1. School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
