Terminology Extraction: An Analysis of Linguistic and Statistical Approaches

  • Maria Teresa Pazienza
  • Marco Pennacchiotti
  • Fabio Massimo Zanzotto
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 185)

Abstract

Are linguistic properties and behaviors important for recognizing terms? Are statistical measures effective for extracting terms? Is it possible to capture some sort of termhood with computational linguistic techniques? Or are terms too sensitive to exogenous and pragmatic factors that cannot be captured within computational linguistics? All these questions are still open. This study tries to contribute to the search for an answer, in the belief that it can be found only through a careful experimental analysis of real case studies and a study of their correlation with theoretical insights.

Keywords

  • Noun Phrase
  • Linguistic Analysis
  • Candidate Term
  • Association Measure
  • Term Extraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Maria Teresa Pazienza (1)
  • Marco Pennacchiotti (1)
  • Fabio Massimo Zanzotto (2)

  1. Artificial Intelligence Research Group, University of Roma Tor Vergata, Italy
  2. University of Milano Bicocca, Italy