Vulcain — An Ontology-Based Information Extraction System

Natural Language Processing and Information Systems (NLDB 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2553)

Abstract

This paper describes Vulcain, an information extraction system dedicated to message filtering for a specific domain. It focuses on a method for identifying domain-specific terms and concepts using syntactic information and an existing domain ontology. Terms are identified by partial syntactic analysis based on Tree Adjoining Grammars (TAG). The domain ontology is represented in description logics (DL), and DL inference mechanisms are used to validate the candidate concepts.
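
The validation step mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy ontology, the concept names, the lexicon, and the acceptance rule (a candidate is kept only if its concept is subsumed by a chosen domain root) are hypothetical and not taken from the paper. A real system would delegate subsumption checking to a description logic reasoner rather than to a hand-written graph traversal.

```python
# Toy illustration only: the ontology, concept names, lexicon, and the
# acceptance rule are assumptions, not the Vulcain implementation.

# Each concept maps to its direct parents (told subsumers).
TOY_ONTOLOGY = {
    "Thing": set(),
    "SecurityEvent": {"Thing"},
    "Attack": {"SecurityEvent"},
    "VirusAttack": {"Attack"},
    "Person": {"Thing"},
}

# Hypothetical lexicon linking extracted candidate terms to concepts.
TOY_LEXICON = {
    "attack": "Attack",
    "virus attack": "VirusAttack",
    "user": "Person",
}


def subsumers(concept, ontology):
    """Return the concept together with all of its ancestors."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology.get(current, ()))
    return seen


def validate_candidate(term, domain_root,
                       ontology=TOY_ONTOLOGY, lexicon=TOY_LEXICON):
    """Accept a candidate term if its concept is subsumed by domain_root."""
    concept = lexicon.get(term.lower())
    if concept is None:
        return False  # no concept is known for this term
    return domain_root in subsumers(concept, ontology)
```

With these toy definitions, `validate_candidate("virus attack", "SecurityEvent")` returns `True`, while `validate_candidate("user", "SecurityEvent")` and `validate_candidate("firewall", "SecurityEvent")` both return `False`: the first because `Person` is not subsumed by `SecurityEvent`, the second because the term has no lexicon entry.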

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Todirascu, A., Romary, L., Bekhouche, D. (2002). Vulcain — An Ontology-Based Information Extraction System. In: Andersson, B., Bergholtz, M., Johannesson, P. (eds) Natural Language Processing and Information Systems. NLDB 2002. Lecture Notes in Computer Science, vol 2553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36271-1_6

  • DOI: https://doi.org/10.1007/3-540-36271-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00307-6

  • Online ISBN: 978-3-540-36271-5

  • eBook Packages: Springer Book Archive
