Using learned extraction patterns for text classification

  • Ellen Riloff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1040)

Abstract

A major knowledge-engineering bottleneck for information extraction systems is the process of constructing an appropriate dictionary of extraction patterns. AutoSlog is a dictionary construction system that has been shown to substantially reduce the time required for knowledge engineering by learning extraction patterns automatically. However, an open question was whether these extraction patterns were useful for tasks other than information extraction. We describe a series of experiments that show how the extraction patterns learned by AutoSlog can be used for text classification. Three dictionaries produced by AutoSlog for different domains performed well in our text classification experiments.

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Ellen Riloff (1)
  1. Department of Computer Science, University of Utah, Salt Lake City, USA