Towards Knowledge Acquisition from Information Extraction

  • Chris Welty
  • J. William Murdock
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4273)


In our research on using information extraction to help populate the semantic web, we have encountered significant obstacles to interoperability between the two technologies. We believe these obstacles are endemic to the basic paradigms and are not quirks of the specific implementations we have worked with. In particular, we identify five dimensions of interoperability that must be addressed in order to populate semantic web knowledge bases that are suitable for reasoning from the output of information extraction systems. We call the task of transforming IE data into knowledge bases knowledge integration, and we briefly present a framework called KITE in which we are exploring these dimensions. Finally, we report the initial results of an experiment in which the knowledge integration process uses the deeper semantics of OWL ontologies to improve the precision of relation extraction from text.


Keywords: Information Extraction, Applications of OWL DL Reasoning



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Chris Welty (1)
  • J. William Murdock (1)
  1. IBM Watson Research Center, Hawthorne, USA
