
Machine Learning, Volume 34, Issue 1–3, pp 233–272

Learning Information Extraction Rules for Semi-Structured and Free Text

  • Stephen Soderland

Abstract

A wealth of on-line text information can be made available to automatic processing by information extraction (IE) systems. Each IE application needs a separate set of rules tuned to the domain and writing style. WHISK helps to overcome this knowledge-engineering bottleneck by learning text extraction rules automatically.

WHISK is designed to handle text styles ranging from highly structured to free text, including text that is neither rigidly formatted nor composed of grammatical sentences. Such semi-structured text has largely been beyond the scope of previous systems. When used in conjunction with a syntactic analyzer and semantic tagging, WHISK can also handle extraction from free text such as news stories.
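To make the idea of an extraction rule for semi-structured text concrete, here is a minimal sketch in Python. It is a hand-written regular expression, not a rule learned by WHISK and not WHISK's rule syntax; the rental-ad text, the name `RENTAL_RULE`, and the helper `extract_rentals` are all hypothetical and chosen only for illustration. The pattern skips over ungrammatical filler and fills two slots (bedrooms and price) for each listing it finds.

```python
import re

# Hypothetical hand-written rule, for illustration only.
# It is NOT WHISK's rule language and was not produced by a learner.
# Reads as: a number followed by "br", then any filler, then "$" and a number.
RENTAL_RULE = re.compile(r"(\d+)\s*br\b.*?\$\s*(\d+)", re.IGNORECASE | re.DOTALL)


def extract_rentals(text):
    """Return one {'bedrooms': int, 'price': int} dict per match of the rule."""
    return [
        {"bedrooms": int(m.group(1)), "price": int(m.group(2))}
        for m in RENTAL_RULE.finditer(text)
    ]


# Made-up ad in a telegraphic, semi-structured style (an assumption, not source data).
ad = (
    "Capitol Hill - 1 br twnhme. fplc D/W W/D. Undrgrnd pkg incl $675. "
    "3 BR, upper flr of turn of ctry HOME. incl gar, grt N. Hill loc $995."
)

for listing in extract_rentals(ad):
    print(listing)
```

A rule like this has to be written and tuned by hand for each domain and writing style, which is the knowledge-engineering cost that learning rules from examples is meant to reduce.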

Keywords: natural language processing, information extraction, rule learning

Copyright information

© Kluwer Academic Publishers 1999

Authors and Affiliations

  • Stephen Soderland
    • Department of Computer Science and Engineering, University of Washington, Seattle
